objcopy prepends directory pathname to symbol name
T

5

15

I am trying to use objcopy to include a binary form of a text file in an executable. (At runtime I need the file as a string.) This works fine until the linker needs to resolve references to the symbol names. The problem is that objcopy builds the symbol names from the full pathname of the input file. Since I am using GNU Autotools to ship the package, this pathname prefix changes from one build location to another, and I don't know which external linker symbol to use in the C/C++ program.

nm libtest.a |grep textfile
textfile.o:
00001d21 D _binary__home_git_textfile_end
00001d21 A _binary__home_git_textfile_size
00000000 D _binary__home_git_textfile_start

libtest.a was produced with (extract from Makefile.am):

SUFFIXES = .txt
.txt.$(OBJEXT):
    objcopy --input binary --output elf32-i386 --binary-architecture i386 $< $@

How can I tell objcopy to use only the stem of the filename in the linker symbols? Or is there another way around the problem?

Tetragram answered 24/3, 2013 at 3:53 Comment(0)
C
12

A generic way of including raw data in an ELF object is the .incbin assembler directive.

The trick is to create a template .S file that could look like this:

        .global foo_start
foo_start:
        .incbin "foo.raw"

        .global foo_end
foo_end:    

Because a .S file is run through cpp, we don't have to hardcode the file name there. For example, we can write:

        .incbin __raw_file_path__

... and then pass it while compiling:

gcc -D__raw_file_path__='"data/foo.png"' foo.S -c -o data/foo.o

Lastly, since we write the .S file ourselves, we can add extra data or information to it. If you include raw text files and want them to be usable as C strings, you can add a zero byte right after the raw data:

        .global foo_start
foo_start:
        .incbin "foo.raw"

        .global foo_end
foo_end:    
        .byte 0

        .global foo_size
foo_size:
        .int foo_end - foo_start
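On the C side, the labels from such a template can be declared as extern arrays. The following is only a minimal sketch and not part of the answer itself. It assumes GCC or Clang on an ELF target (where C names get no leading underscore), and it uses a top-level asm block with .ascii as a stand-in for the .S file and its .incbin line, so that it builds without an external file. The helper names foo_text and foo_length are hypothetical, the data is placed in .rodata, and a .balign 4 is added before foo_size so that the int is aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the .S template (assumption for illustration): .ascii
 * replaces .incbin "foo.raw" so the sketch needs no external file. */
__asm__(
    ".pushsection .rodata\n"
    ".global foo_start\n"
    "foo_start:\n"
    "    .ascii \"hello, world\\n\"\n"
    ".global foo_end\n"
    "foo_end:\n"
    "    .byte 0\n"              /* terminator, placed after foo_end */
    ".balign 4\n"                /* added here: keep foo_size aligned */
    ".global foo_size\n"
    "foo_size:\n"
    "    .int foo_end - foo_start\n"
    ".popsection\n");

/* The labels are addresses, so they are declared as arrays. */
extern const char foo_start[];
extern const char foo_end[];
/* foo_size is a real int stored in the object, so it is read as a value. */
extern const int foo_size;

/* Returns the embedded data as a NUL-terminated C string. */
const char *foo_text(void)
{
    return foo_start;
}

/* Returns the length of the data, excluding the trailing zero byte. */
size_t foo_length(void)
{
    assert(foo_end >= foo_start);
    return (size_t)(foo_end - foo_start);
}
```

Because foo_end is placed before the zero byte, foo_length() and foo_size both describe the raw data only, while foo_text() can still be passed to functions that expect a C string.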

If you want full flexibility, you can of course preprocess the file yourself to alter any part of it, e.g. starting from a template like this:

        .global @sym@_start
@sym@_start:
        .incbin "@file@"

        .global @sym@_end
@sym@_end:

... and then compile it:

sed -e "s,@sym@,passwd,g" -e "s,@file@,/etc/passwd," <foo.S.in | gcc -x assembler-with-cpp - -o passwd.o -c
Chemisorb answered 23/8, 2013 at 23:42 Comment(0)
E
11

Somewhat ironically you can use objcopy to solve the problem via the --redefine-sym option that allows renaming of symbols...

If I use objcopy to create an object file from a PNG in another directory:

$ objcopy -I binary -O elf64-x86-64 -B i386 --rename-section .data=.rodata,alloc,load,data,contents,readonly ../../resources/test.png test_png.o

The resulting object has the following symbols:

$ readelf -s test_png.o -W

Symbol table '.symtab' contains 5 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     2: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    1 _binary_______resources_test_png_start
     3: 0000000000003aaa     0 NOTYPE  GLOBAL DEFAULT    1 _binary_______resources_test_png_end
     4: 0000000000003aaa     0 NOTYPE  GLOBAL DEFAULT  ABS _binary_______resources_test_png_size

These can then be renamed:

$ objcopy --redefine-sym _binary_______resources_test_png_start=_binary_test_png_start test_png.o
$ objcopy --redefine-sym _binary_______resources_test_png_size=_binary_test_png_size test_png.o
$ objcopy --redefine-sym _binary_______resources_test_png_end=_binary_test_png_end test_png.o

Resulting in an object with the symbol names that objcopy would have generated if the PNG had been located in the current directory:

$ readelf -s test_png.o -W

Symbol table '.symtab' contains 5 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     2: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    1 _binary_test_png_start
     3: 0000000000003aaa     0 NOTYPE  GLOBAL DEFAULT    1 _binary_test_png_end
     4: 0000000000003aaa     0 NOTYPE  GLOBAL DEFAULT  ABS _binary_test_png_size
Elbert answered 3/5, 2013 at 8:27 Comment(3)
The mention of --redefine-sym is a good one, but it seems insufficient: how is the caller of objcopy supposed to know how to form the "original" symbol name? I note that if the input file to objcopy is something like ../../foo/bar.txt the symbol name is something horrible like _binary________foo_bar_txt_start. Having to encode the logic to turn dots, slashes, and perhaps other characters (which?) into underscores seems pretty silly. And bizarrely, objcopy's --wildcard option could help us, but it seems to have no effect with --redefine-sym (I suppose they intend it for other uses). – Amorist
@JohnZwinck: You only need to be able to reconstruct the non-directory portion, then objdump for the names and check which ends in your desired name, and then use that for renaming. – Sager
Looking at the code, all non-alphanumeric characters are converted to _. So the following converts the filename: echo -n "$filename" | tr -c 'A-Za-z0-9' '_'. Prepend _binary_ and append _start and the other suffixes. – Hilariohilarious
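The rule described in the last comment can be sketched in C as follows. This is only an illustration under that comment's description (every byte that is not an ASCII letter or digit becomes an underscore, then _binary_ is prepended and the suffix appended); it has not been checked against any particular binutils version, and the function name objcopy_symbol is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* True for ASCII letters and digits only, independent of the locale. */
static int is_ascii_alnum(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9');
}

/* Hypothetical helper: writes "_binary_<mangled path>_<suffix>" to out.
 * Returns 0 on success, -1 if out_len is too small. */
int objcopy_symbol(const char *path, const char *suffix,
                   char *out, size_t out_len)
{
    static const char prefix[] = "_binary_";
    size_t prefix_len = sizeof prefix - 1;
    size_t path_len, suffix_len, i;
    char *p = out;

    assert(path != NULL && suffix != NULL && out != NULL);
    path_len = strlen(path);
    suffix_len = strlen(suffix);

    /* prefix + mangled path + '_' + suffix + terminating NUL */
    if (prefix_len + path_len + 1 + suffix_len + 1 > out_len)
        return -1;

    memcpy(p, prefix, prefix_len);
    p += prefix_len;
    for (i = 0; i < path_len; i++) {
        unsigned char c = (unsigned char)path[i];
        *p++ = is_ascii_alnum(c) ? (char)c : '_';
    }
    *p++ = '_';
    memcpy(p, suffix, suffix_len + 1); /* copies the NUL as well */
    return 0;
}
```

Called with "/home/git/textfile" and "start", this yields _binary__home_git_textfile_start, which matches the nm output in the question. The result can then be used as the left-hand side of --redefine-sym.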
S
6

Another alternative, which I have used, is to cd to the source directory and then give objcopy the basename of the source. In bash, this would be:

cd "$(dirname "$SOURCE")"
objcopy ... "$(basename "$SOURCE")" "$TARGET"

This way the generated symbols are always _binary_file_name_xxx, without the path. Note that after the cd, a relative $TARGET is resolved against the source directory, so an absolute path is the safer choice.

Shortcoming answered 23/4, 2014 at 14:55 Comment(0)
Q
2

I had to do this with CMake, and I ended up using /dev/stdin as the input to get consistent symbol names. The final names are built with string(MAKE_C_IDENTIFIER ...) and then applied with objcopy --redefine-sym on the resulting object file.

The resulting function is:

function(make_binary_object __file)
    get_filename_component(__file_name ${__file} NAME)
    set(__object ${CMAKE_CURRENT_BINARY_DIR}/${__file_name}.obj)
    string(MAKE_C_IDENTIFIER ${__file_name} __file_c_identifier)
    add_custom_command(OUTPUT ${__object}
        COMMAND ${CMAKE_OBJCOPY}
            --input-format binary
            --output-format elf64-x86-64
            --binary-architecture i386:x86-64
            /dev/stdin
            ${__object} < ${__file}
        COMMAND ${CMAKE_OBJCOPY}
            --redefine-sym _binary__dev_stdin_start=_binary_${__file_c_identifier}_start
            --redefine-sym _binary__dev_stdin_end=_binary_${__file_c_identifier}_end
            --redefine-sym _binary__dev_stdin_size=_binary_${__file_c_identifier}_size
            ${__object}
        WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
        DEPENDS ${__file})
    set_source_files_properties(${__object} PROPERTIES EXTERNAL_OBJECT TRUE)
endfunction()

And you can use it like this:

make_binary_object(index.html)

add_executable(my_server
    server.c
    ${CMAKE_CURRENT_BINARY_DIR}/index.html.obj)
Quezada answered 2/6, 2020 at 13:21 Comment(0)
F
-2

One simple solution is to transform your text file into what could be used to initialize an array of char. So, you'd get 0x41,0x42,0x43,0x30,0x31,0x32 for "ABC012". You can then #include this sequence of bytes. You can also escape all non-ASCII chars instead of converting everything into bytes so that most of the text is still readable in the generated include file.
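As a rough illustration of that transformation, here is a minimal C sketch that formats a byte buffer as a comma-separated list of hex literals. The function name bytes_to_initializer is hypothetical; a real build step would read the text file and write the result to a generated include file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats len bytes as "0x41,0x42,..." into out.
 * Returns 0 on success, -1 if out_len is too small. */
int bytes_to_initializer(const unsigned char *data, size_t len,
                         char *out, size_t out_len)
{
    size_t pos = 0;
    size_t i;

    assert(data != NULL && out != NULL);
    if (out_len == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < len; i++) {
        /* Every element after the first is preceded by a comma. */
        int n = snprintf(out + pos, out_len - pos, "%s0x%02x",
                         i ? "," : "", (unsigned)data[i]);
        if (n < 0 || (size_t)n >= out_len - pos)
            return -1; /* output would have been truncated */
        pos += (size_t)n;
    }
    return 0;
}
```

For the input "ABC012" this produces 0x41,0x42,0x43,0x30,0x31,0x32, which can be placed between the braces of a char array initializer, for example via #include.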

Feriga answered 24/3, 2013 at 8:6 Comment(5)
Using stdin and extern avoids having to store the generated source. – Merlenemerlin
@Merlenemerlin I'm not sure I understand what you mean. – Feriga
Using -x<language> and - as the input for gcc/g++. – Merlenemerlin
@Merlenemerlin How's that helpful? Sorry, I'm not following you. – Feriga
Like this: xxd -i input.txt | sed 's/input_txt/test/' | gcc -c -xc - -o obj.o – Merlenemerlin
