Embedding binary blobs using gcc mingw

I am trying to embed binary blobs into an exe file. I am using mingw gcc.

I make the object file like this:

ld -r -b binary -o binary.o input.txt

I then look at the objdump output to get the symbols:

objdump -x binary.o

And it gives symbols named:

_binary_input_txt_start
_binary_input_txt_end
_binary_input_txt_size

I then try to access them in my C program:

#include <stdlib.h>
#include <stdio.h>

extern char _binary_input_txt_start[];

int main (int argc, char *argv[])
{
    char *p;
    p = _binary_input_txt_start;

    return 0;
}

Then I compile like this:

gcc -o test.exe test.c binary.o

But I always get:

undefined reference to _binary_input_txt_start

Does anyone know what I am doing wrong?

Freehold answered 13/4, 2010 at 4:38 Comment(4)
By the way, I was unaware of this method of pulling arbitrary data into an executable - nice. - Krouse
What does this method offer that's not offered by .rc files? - Booker
@Booker Easier access to the content. It does not need calls to any Resource APIs. - Hulbert
Also: github.com/graphitemaster/incbin - Csch

In your C program remove the leading underscore:

#include <stdlib.h>
#include <stdio.h>

extern char binary_input_txt_start[];

int main (int argc, char *argv[])
{
    char *p;
    p = binary_input_txt_start;

    return 0;
}

C compilers often (always?) seem to prepend an underscore to extern names. I'm not entirely sure why that is. I assume there's some truth to this Wikipedia article's claim:

It was common practice for C compilers to prepend a leading underscore to all external scope program identifiers to avert clashes with contributions from runtime language support

But it strikes me that if underscores were prepended to all externs, then you're not really partitioning the namespace very much. Anyway, that's a question for another day, and the fact is that the underscores do get added.
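If you want to check what your own toolchain does, here is a minimal sketch. It assumes GCC or Clang, which predefine the macro __USER_LABEL_PREFIX__ as the prefix they add to C identifiers. The helper names c_symbol_prefix and object_symbol_name are made up for this illustration.

```c
/* Sketch only: assumes GCC or Clang, which predefine __USER_LABEL_PREFIX__. */
#include <assert.h>
#include <stdio.h>

#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Hypothetical helper: the prefix the compiler adds to C identifiers.
   Expected to be "_" on 32-bit Windows targets and "" on typical
   Linux ELF targets. */
static const char *c_symbol_prefix(void)
{
#ifdef __USER_LABEL_PREFIX__
    return STRINGIFY(__USER_LABEL_PREFIX__);
#else
    return ""; /* macro not available: assume no prefix */
#endif
}

/* Hypothetical helper: write the object-file name of the C identifier
   `c_name` into `buf`. Returns the length that snprintf reports. */
static int object_symbol_name(char *buf, size_t size, const char *c_name)
{
    assert(buf != NULL && c_name != NULL);
    return snprintf(buf, size, "%s%s", c_symbol_prefix(), c_name);
}
```

Calling object_symbol_name with "binary_input_txt_start" and comparing the result against the objdump listing should show whether the C declaration needs the leading underscore on a given target.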

Krouse answered 13/4, 2010 at 5:47 Comment(8)
Wow... thanks a lot. This was driving me mad. I knew it must have been something simple. I have just debugged it and noticed that the name was being changed to __binary_input_txt_start. - Freehold
@myforwik: just in case you're interested, I've posted a question asking why C does this: #2628011 - Krouse
@Michael: The article's claim is true. The runtimes were written in assembler, which was free to use names without underscores prepended and could thereby be assured not to clash with any symbols defined in the C code, and conversely the C code had no way to access the symbols from the asm runtime code. - Stavros
Does anyone know how much data can be embedded that way? - Hulbert
The underscore stands for a reserved name, doesn't it? I assume it's to avoid a clash with hand-written code references. - Angkor
On the contrary, in my case I do have to add the underscore, meaning the initial code in the question works for me. uname reports "Linux aditya-lucid 2.6.32-40-generic #87-Ubuntu SMP Mon Mar 5 20:26:31 UTC 2012 i686 GNU/Linux", and I am using "gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1)". - Prager
@aditya: perhaps there's a difference in that detail that depends on the target? Windows toolchains have a tendency to automatically add underscores to external names when targeting Win32 x86. I wouldn't be surprised if that doesn't happen for other targets (even Win32 x64). - Krouse
@MichaelBurr: Hmm... interesting topic anyway, and useful as well... much to learn :) - Prager

From the ld man page:

--leading-underscore

--no-leading-underscore

For most targets default symbol-prefix is an underscore and is defined in target's description. By this option it is possible to disable/enable the default underscore symbol-prefix.

so

ld -r -b binary -o binary.o input.txt --leading-underscore

should be the solution.

Snowdrop answered 5/4, 2015 at 11:54 Comment(0)

I tested it in Linux (Ubuntu 10.10).

  1. Resource file:
    input.txt

  2. gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5 [generates ELF executable, for Linux]
    Generates symbol _binary__input_txt_start.
    Accepts symbol _binary__input_txt_start (with underscore).

  3. i586-mingw32msvc-gcc (GCC) 4.2.1-sjlj (mingw32-2) [generates PE executable, for Windows]
    Generates symbol _binary__input_txt_start.
    Accepts symbol binary__input_txt_start (without underscore).
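One way to avoid caring about this difference is a GNU asm label, which fixes the object-file name of a declaration exactly, so the compiler adds no prefix of its own. The sketch below assumes GCC or Clang and uses the symbol names from the question (_binary_input_txt_start and _binary_input_txt_end). The names blob_start, blob_end and blob_size are made up, and the top-level asm block is only a stand-in for binary.o so that the file links by itself.

```c
/* Sketch only: assumes GCC or Clang (asm labels are a GNU extension). */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for binary.o so this file links on its own: it defines the two
   symbols from the question around the bytes "hello\n". In a real build,
   delete this block and link binary.o instead. */
__asm__(
    ".data\n"
    ".globl _binary_input_txt_start\n"
    "_binary_input_txt_start:\n"
    ".ascii \"hello\\n\"\n"
    ".globl _binary_input_txt_end\n"
    "_binary_input_txt_end:\n"
    ".text\n");

/* The asm label gives the exact object-file name, so no prefix is added
   on any target. blob_start and blob_end are hypothetical C-side names. */
extern const char blob_start[] __asm__("_binary_input_txt_start");
extern const char blob_end[]   __asm__("_binary_input_txt_end");

/* Number of embedded bytes, computed from the two addresses. */
static size_t blob_size(void)
{
    uintptr_t first = (uintptr_t)blob_start;
    uintptr_t last = (uintptr_t)blob_end;
    assert(last >= first);
    return (size_t)(last - first);
}
```

With the stand-in data, blob_size() is 6 and blob_start[0] is 'h'. The same two declarations should work unchanged for both the ELF and the PE build, as long as the object file really contains the names given in the asm labels.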

Basil answered 12/10, 2012 at 23:21 Comment(1)
Using tdm-gcc 4.8.1, I must refer to the variables using the underscore. - Immesh

Apparently this feature is not present in OS X's ld, so there you have to do it quite differently, with a custom gcc flag that they added. You also can't reference the data directly; you must do some runtime initialization to get the address.

So it might be more portable to make yourself an assembler source file which includes the binary at build time, a la this answer.
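A rough sketch of that idea, written as top-level inline assembly in a C file rather than a separate .s file. It assumes GCC or Clang with an assembler that supports the .incbin directive. EMBEDDED_FILE, ASM_NAME and the embedded_* names are made up for this illustration, and the file embedded by default is the source file itself, only so that the sketch builds without extra inputs.

```c
/* Sketch only: assumes GCC or Clang and an assembler with .incbin. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)
/* Assembler-level name of a C identifier: compiler prefix ("_" or
   nothing) followed by the identifier itself. */
#define ASM_NAME(name) STRINGIFY(__USER_LABEL_PREFIX__) #name

/* Hypothetical setting: the file to embed. Defaults to this source file;
   override with e.g. -DEMBEDDED_FILE='"input.txt"'. */
#ifndef EMBEDDED_FILE
#define EMBEDDED_FILE __FILE__
#endif

/* Place the file's bytes between two global labels in the data section,
   then switch back to the text section for the compiler's own output. */
__asm__(
    ".data\n"
    ".globl " ASM_NAME(embedded_start) "\n"
    ASM_NAME(embedded_start) ":\n"
    ".incbin \"" EMBEDDED_FILE "\"\n"
    ".globl " ASM_NAME(embedded_end) "\n"
    ASM_NAME(embedded_end) ":\n"
    ".text\n");

extern const unsigned char embedded_start[];
extern const unsigned char embedded_end[];

/* Number of embedded bytes, computed from the two addresses. */
static size_t embedded_size(void)
{
    uintptr_t first = (uintptr_t)embedded_start;
    uintptr_t last = (uintptr_t)embedded_end;
    assert(last >= first);
    return (size_t)(last - first);
}
```

Because the labels are built with the compiler's own prefix, the plain C declarations match on targets with and without the leading underscore. The path given to .incbin is resolved by the assembler at build time, so it has to be reachable from the directory the compiler is run in.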

Turnbuckle answered 25/5, 2015 at 23:22 Comment(0)
