Binary resource inclusion like it's 1989
by dweller - 2024-07-30
So, C23 is adding the #embed
preprocessor directive, which is useful for embedding resources into your
binary. It looks something like this:
const char image[] =
{
#embed "image.tga"
};
It also has some niceties, like the if_empty
parameter, which lets you embed some default data if the file is empty.
(I first assumed it also covers a non-existent file, but as far as I can
tell a missing file is simply an error.) Check out
cppreference.com for more information,
or get the latest C23 draft as of writing this
post.
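To make the parameters a bit more concrete, here is a minimal sketch that uses limit and if_empty together. It embeds the first bytes of its own source file via __FILE__, so no image.tga is needed to try it. The names HAVE_EMBED, head and head_size are made up for this illustration, and the __has_embed guard is there so the file still compiles (with a one-byte fallback) on compilers without #embed.

```c
#include <stddef.h>

/* __has_embed is part of C23; older preprocessors don't define it,
   so test for its existence before using it. */
#if defined(__has_embed)
#  if __has_embed(__FILE__)
#    define HAVE_EMBED 1 /* hypothetical helper macro for this sketch */
#  endif
#endif

/* The first few bytes of this very source file,
   or a single zero byte when #embed is not available. */
static const unsigned char head[] = {
#ifdef HAVE_EMBED
#embed __FILE__ limit(8) if_empty(0)
#else
    0
#endif
};

/* Number of bytes that ended up in the array. */
size_t head_size(void)
{
    return sizeof(head);
}
```

With #embed support the array holds at most 8 bytes because of limit(8). The if_empty(0) part would only kick in if the resource turned out to be empty.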
In any case, as of writing this entry to my log, no released compiler that I know of supports this:
$ gcc -std=c2x -Wall -Wextra -pedantic test.c
test.c:5:6: error: invalid preprocessing directive #embed
5 | #embed "image.tga"
| ^~~~~
test.c:3:12: error: zero or negative size array 'image'
3 | const char image[] =
| ^~~~~
Not only that, but a feature like this would also be useful for people like me who are either stuck with, or intentionally use, older standards like C99 or even C89 (my choice most of the time).
While I was talking with a friend about adding an embed-like feature to his programming language, he said:
you mean #include? :)
Thus the idea was born. #include
indeed just pastes a file into your source code. Alas, the
compiler will try to parse the raw binary and won’t be happy:
const char image[] =
{
#include "image.tga"
};
$ cc -std=c89 -Wall -Wextra -pedantic test2.c
...
image.tga:13:14: error: stray '\377' in program
13 | <U+0000><U+0000><U+0000><U+0016><U+000C><d9><ff><U+000F><U+000B><e3><ff><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0004>Z<f7><ff><U+0002>m<fa><ff><U+0000><U+0000><U+0000><U+0000><U+000B><9d><fc><ff><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000>?
| ^~~~
image.tga:13:15: warning: null character(s) ignored
13 | <U+0000><U+0000><U+0000><U+0016><U+000C><d9><ff><U+000F><U+000B><e3><ff><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0004>Z<f7><ff><U+0002>m<fa><ff><U+0000><U+0000><U+0000><U+0000><U+000B><9d><fc><ff><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000>?
| ^~~~~~~~
image.tga:13:23: error: stray '\4' in program
13 | <e3><ff><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0004>Z<f7><ff><U+0002>m<fa><ff><U+0000><U+0000><U+0000><U+0000><U+000B><9d><fc><ff><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000>?
| ^~~~~~~~
image.tga:13:25: error: stray '\367' in program
13 | <U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0004>Z<f7><ff><U+0002>m<fa><ff><U+0000><U+0000><U+0000><U+0000><U+000B><9d><fc><ff><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000><U+0000>?
| ^~~~
...
Well, let’s make the compiler happy. All we need to do is some preprocessing before the C preprocessor. Prepreprocessing, if you will.
My first idea was simple and worked out of the box, with one small catch. Just read a binary
file, output each byte as a hex escape, wrap it all in double quotes, and finish with a semicolon. It is
easily done with printf(3)’s %x
conversion specifier. (See the code in the bin2c.c file below.)
const char image[] =
#include "image.tga"
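For reference, that first version can be sketched roughly like this. It is a reconstruction from the description above, not the original code, and the function name emit_string_literal is made up for the illustration.

```c
#include <stdio.h>

/* Write the bytes of `in` to `out` as one C string literal followed by a
   semicolon, e.g. the bytes 00 ff 41 become "\x00\xff\x41";
   Returns 0 on success, -1 on a read or write error. */
int emit_string_literal(FILE *in, FILE *out)
{
    int c;

    if (fputc('"', out) == EOF) return -1;
    while ((c = fgetc(in)) != EOF)
    {
        /* Every byte is escaped, so a hex escape is always followed by a
           backslash or the closing quote, never by a stray hex digit. */
        if (fprintf(out, "\\x%02x", (unsigned)c) < 0) return -1;
    }
    if (ferror(in)) return -1;
    if (fputs("\";\n", out) == EOF) return -1;
    return 0;
}
```

One thing to keep in mind with this variant: a string literal carries a terminating NUL, so sizeof(image) ends up one byte larger than the file.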
This works, but as I mentioned above, it has one itty-bitty problem:
... warning: string length '21000' is greater than the length '509' ISO C90 compilers are required to support [-Woverlength-strings]
You live, you learn. Apparently ISO C89/C90 compilers are not required to handle string literals
longer than 509 characters. GCC 13.2.0 seems to be handling it well, but I wanted to be in spec.
So instead, I simply output a char
literal for each byte. This sadly makes the generated file
larger.
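If the size of the generated file matters, one possible variation (not what bin2c.c does) is to emit plain decimal values and wrap the lines. Each byte then costs at most 4 characters ("255,") instead of 7 ('\xff',), and short lines also stay clear of the 509-character minimum that C89 lists for a logical source line. A minimal sketch, assuming the target array is declared as unsigned char; the function name emit_decimal_bytes is hypothetical:

```c
#include <stdio.h>
#include <stddef.h>

/* Write `len` bytes as comma-separated decimal values, 16 per line.
   Returns 0 on success, -1 on a write error. */
int emit_decimal_bytes(const unsigned char *data, size_t len, FILE *out)
{
    size_t i;

    for (i = 0; i < len; i++)
    {
        /* End the line after every 16th value and after the last one. */
        const char *sep = (i % 16 == 15 || i + 1 == len) ? ",\n" : ",";
        if (fprintf(out, "%u%s", (unsigned)data[i], sep) < 0) return -1;
    }
    return 0;
}
```

The trailing comma after the last value is fine inside an initializer list, same as with the char literal output.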
bin2c.c:
#define _XOPEN_SOURCE 500 /* for snprintf() when compiling as C89 */
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

int main(int argc, char** argv)
{
    int rc = 0;
    int ret = EXIT_SUCCESS;
    FILE* in = stdin;
    FILE* out = stdout;
    const char* name = "stdin";
    char buffer[4096] = {0};
    size_t got = 0;

    if(argc == 2)
    {
        name = argv[1];
        /* "rb": the input is binary, so avoid text-mode translation */
        in = fopen(name, "rb");
        if(!in)
        {
            perror("fopen");
            exit(EXIT_FAILURE);
        }

        /* the output file is the input name with ".h" appended */
        rc = snprintf(buffer, sizeof(buffer), "%s.h", name);
        if(rc < 0 || (size_t)rc >= sizeof(buffer))
        {
            fprintf(stderr, "snprintf: output file name too long\n");
            fclose(in);
            exit(EXIT_FAILURE);
        }

        out = fopen(buffer, "w");
        if(!out)
        {
            perror("fopen");
            fclose(in);
            exit(EXIT_FAILURE);
        }
    }
    else if(argc > 2) fprintf(stderr, "warning: ignoring excess parameters\n");

    /* from here on, buffer is reused as the read buffer */
    for(;;)
    {
        size_t i;

        got = fread(buffer, 1, sizeof(buffer), in);
        rc = ferror(in);
        if(rc)
        {
            perror("fread");
            ret = EXIT_FAILURE;
            goto end;
        }

        /* one char literal per byte, e.g. '\x0f', */
        for(i = 0; i < got; i++) fprintf(out, "'\\x%02x',", (unsigned char)buffer[i]);

        rc = feof(in);
        if(rc) break;
    }
    fprintf(out, "\n");

end:
    fclose(in);
    fclose(out);
    return ret;
}
As you can see, this simple program just creates a header file named after the input file, with .h appended. Without arguments it reads from standard input and writes to standard output, so you can chain it with pipes in scripts. I also wrote it to depend only on the standard C library so anyone can use it. You are free to steal this.
With this, we can finally #include
files in our source code to embed them in the binary.
To demonstrate this, I wrote a simple program with an embedded
TGA file that it prints to standard output using
ANSI escape sequences for color. For this code
to work, your terminal has to support 24-bit True Color sequences.
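In case the True Color sequences are unfamiliar, here is a minimal sketch of how one "pixel" could be formatted. This is not the code from cli.c; the function name format_truecolor_cell is made up, and drawing a pixel as two spaces on a colored background is just one assumed way to do it.

```c
#include <stdio.h>

/* Format one colored "pixel" (two spaces on an RGB background) into dst.
   dst must have room for at least 32 bytes. Returns the string length. */
int format_truecolor_cell(char *dst, unsigned char r, unsigned char g, unsigned char b)
{
    /* ESC [ 48 ; 2 ; R ; G ; B m   sets the background color,
       ESC [ 0 m                    resets all attributes. */
    return sprintf(dst, "\033[48;2;%u;%u;%um  \033[0m",
                   (unsigned)r, (unsigned)g, (unsigned)b);
}
```

Swapping 48 for 38 would set the foreground color instead. The escape character is written as \033 rather than \x1b to keep hex escapes from swallowing any following hex digits.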
example.c:
#include <stdio.h>
#include "common.h"
#include "tga.c"
#include "cli.c"

const u8 image[] =
{
#include "image.tga.h"
};

int main(void)
{
    texture tex = {0};

    tga2tex_from_mem(&tex, image, sizeof(image));
    cli_draw_tex(&tex, true); /* true - Black&White, false - True Color */

    return 0;
}
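The TGA loading code is not shown in this post, so here is a rough sketch of what reading the header from an in-memory buffer like image[] can look like. The struct tga_info and the function tga_parse_header are hypothetical names for this illustration, not the ones from tga.c, and the sketch only accepts uncompressed true-color files without a color map.

```c
#include <stddef.h>

/* Hypothetical struct for this sketch; not the types from the post's tga.c. */
struct tga_info
{
    unsigned width;
    unsigned height;
    unsigned bits_per_pixel;
    size_t   pixel_offset; /* where pixel data starts within the buffer */
};

/* Parse the 18-byte TGA header of an uncompressed true-color image.
   Returns 0 on success, -1 if the buffer is too short or the type differs. */
int tga_parse_header(const unsigned char *data, size_t len, struct tga_info *out)
{
    if (len < 18) return -1;
    if (data[1] != 0) return -1; /* color map type: only "none" is accepted */
    if (data[2] != 2) return -1; /* image type 2: uncompressed true-color */

    /* Multi-byte fields are stored little-endian. */
    out->width          = data[12] | ((unsigned)data[13] << 8);
    out->height         = data[14] | ((unsigned)data[15] << 8);
    out->bits_per_pixel = data[16];

    /* Pixel data follows the header and the optional image ID field. */
    out->pixel_offset = 18 + (size_t)data[0];
    if (out->pixel_offset > len) return -1;
    return 0;
}
```

With the embedded array from example.c, the call would be something like tga_parse_header(image, sizeof(image), &info).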
Here’s an example in black and white.
$ ls
bin2c.c cli.c example.c tga.c common.h image.tga
$ cc -std=c89 -Wall -Wextra -pedantic bin2c.c -o bin2c
$ ./bin2c image.tga
$ cc -std=c89 -Wall -Wextra -pedantic example.c -o example
$ ./example
####
####################
########## ####
########## ##
########## ## ##
########## ## ##
######## ##
############################
#### ## ##
## #### #### ##
#### ## #### #### ##
## #### #### ##
## #### ##
$
And here’s a screenshot in True Color:
Success! You can easily add bin2c
to your Makefile or any other build script and have it
generate headers that you can #include in your source. Ez pz, no need for C23! ;)
P.S. Interested in the rest of the owl? You can check it out at my git repository. It’s pretty barebones and doesn’t handle all TGA files properly, only non-RLE ones with ARGB channels in that order. But what did you expect from just an example?
¯\_(ツ)_/¯