gdb add-symbol-file all sections and load address

I'm debugging a boot loader (syslinux) with gdb and the gdb stub of qemu. At some point the main file loads a shared object, ldlinux.elf.

I would like to add the symbols for that file in gdb. The command add-symbol-file seems like the way to go. However, since the file is relocatable, I have to specify the memory address it has been loaded at. And here comes the problem.

Although I know the base address at which the LOAD segment has been loaded, add-symbol-file works section-wise and wants me to specify the address at which each section has been loaded.

Can I tell gdb to load all the symbols of all the sections provided that I specify the base address of the file in memory?

Does the behavior of gdb make sense? The section headers aren't used for running an ELF file and are even optional. I can't see a use case where specifying the load address of each section would be useful.


Example

Here are the program headers and section headers of the shared object.

Elf file type is DYN (Shared object file)
Entry point 0x4c60
There are 3 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0x1db10 0x20bfc RWE 0x1000
  DYNAMIC        0x01d618 0x0001d618 0x0001d618 0x00098 0x00098 RW  0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10

 Section to Segment mapping:
  Segment Sections...
   00     .gnu.hash .dynsym .dynstr .rel.dyn .rel.plt .plt .text .rodata .ctors .dtors .data.rel.ro .dynamic .got .got.plt .data .bss 
   01     .dynamic 
   02     
There are 29 section headers, starting at offset 0x78618:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .gnu.hash         GNU_HASH        00000094 000094 0007e0 04   A  2   0  4
  [ 2] .dynsym           DYNSYM          00000874 000874 0015c0 10   A  3   1  4
  [ 3] .dynstr           STRTAB          00001e34 001e34 0010f4 00   A  0   0  1
  [ 4] .rel.dyn          REL             00002f28 002f28 000ce8 08   A  2   0  4
  [ 5] .rel.plt          REL             00003c10 003c10 000568 08  AI  2   6  4
  [ 6] .plt              PROGBITS        00004180 004180 000ae0 04  AX  0   0 16
  [ 7] .text             PROGBITS        00004c60 004c60 013816 00  AX  0   0  4
  [ 8] .rodata           PROGBITS        00018480 018480 00462f 00   A  0   0 32
  [ 9] .ctors            INIT_ARRAY      0001cab0 01cab0 000010 00  WA  0   0  4
  [10] .dtors            FINI_ARRAY      0001cac0 01cac0 000004 00  WA  0   0  4
  [11] .data.rel.ro      PROGBITS        0001cae0 01cae0 000b38 00  WA  0   0 32
  [12] .dynamic          DYNAMIC         0001d618 01d618 000098 08  WA  3   0  4
  [13] .got              PROGBITS        0001d6b0 01d6b0 0000d0 04  WA  0   0  4
  [14] .got.plt          PROGBITS        0001d780 01d780 0002c0 04  WA  0   0  4
  [15] .data             PROGBITS        0001da40 01da40 0000d0 00  WA  0   0 32
  [16] .bss              NOBITS          0001db20 01db10 0030dc 00  WA  0   0 32
  [17] .comment          PROGBITS        00000000 01db10 000026 01  MS  0   0  1
  [18] .debug_aranges    PROGBITS        00000000 01db38 0010c0 00      0   0  8
  [19] .debug_info       PROGBITS        00000000 01ebf8 021ada 00      0   0  1
  [20] .debug_abbrev     PROGBITS        00000000 0406d2 009647 00      0   0  1
  [21] .debug_line       PROGBITS        00000000 049d19 00bd3a 00      0   0  1
  [22] .debug_frame      PROGBITS        00000000 055a54 004574 00      0   0  4
  [23] .debug_str        PROGBITS        00000000 059fc8 00538c 01  MS  0   0  1
  [24] .debug_loc        PROGBITS        00000000 05f354 01312d 00      0   0  1
  [25] .debug_ranges     PROGBITS        00000000 072481 0005d0 00      0   0  1
  [26] .shstrtab         STRTAB          00000000 072a51 000101 00      0   0  1
  [27] .symtab           SYMTAB          00000000 072b54 003530 10     28 504  4
  [28] .strtab           STRTAB          00000000 076084 002593 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

If I try to load the file at the address 0x7fab000 then it will relocate the symbols so that the .text section starts at 0x7fab000.

(gdb) add-symbol-file bios/com32/elflink/ldlinux/ldlinux.elf 0x7fab000
add symbol table from file "bios/com32/elflink/ldlinux/ldlinux.elf" at
        .text_addr = 0x7fab000
(y or n) y
Reading symbols from bios/com32/elflink/ldlinux/ldlinux.elf...done.

And then all the symbols are off by 0x4c60 bytes.

Lavinalavine answered 10/10, 2015 at 1:7 Comment(0)

So, finally, I made my own command with Python and the readelf tool. It's not very clean, since it runs readelf in a subprocess and parses its output instead of parsing the ELF file directly, but it works (for 32-bit ELF files only).

It uses the section headers to generate and run an add-symbol-file command with all the sections correctly relocated. Usage is simple: you give it the ELF file and the base address at which the file was loaded. And since remove-symbol-file wasn't working properly when given just the filename, I made a remove-symbol-file-all command that generates and runs the right remove-symbol-file -a address command.

(gdb) add-symbol-file-all bios/com32/elflink/ldlinux/ldlinux.elf 0x7fab000
add symbol table from file "bios/com32/elflink/ldlinux/ldlinux.elf" at
        .text_addr = 0x7fafc50
        .gnu.hash_addr = 0x7fab094
        .dynsym_addr = 0x7fab874
        .dynstr_addr = 0x7face34
        .rel.dyn_addr = 0x7fadf28
        .rel.plt_addr = 0x7faec08
        .plt_addr = 0x7faf170
        .rodata_addr = 0x7fc34e0
        .ctors_addr = 0x7fc7af0
        .dtors_addr = 0x7fc7b00
        .data.rel.ro_addr = 0x7fc7b20
        .dynamic_addr = 0x7fc8658
        .got_addr = 0x7fc86f0
        .got.plt_addr = 0x7fc87bc
        .data_addr = 0x7fc8a80
        .bss_addr = 0x7fc8b60
(gdb) remove-symbol-file-all bios/com32/elflink/ldlinux/ldlinux.elf 0x7fab000

Here is the code to be added in the .gdbinit file.

python
import subprocess
import re

def relocatesections(filename, addr):
    p = subprocess.Popen(["readelf", "-S", filename], stdout = subprocess.PIPE)

    sections = []
    textaddr = '0'
    for line in p.stdout.readlines():
        line = line.decode("utf-8").strip()
        if not line.startswith('[') or line.startswith('[Nr]'):
            continue

        line = re.sub(r' +', ' ', line)
        # Raw string for the replacement too: '\g' is an invalid escape in a normal string literal.
        line = re.sub(r'\[ *(\d+)\]', r'\g<1>', line)
        fieldsvalue = line.split(' ')
        fieldsname = ['number', 'name', 'type', 'addr', 'offset', 'size', 'entsize', 'flags', 'link', 'info', 'addralign']
        sec = dict(zip(fieldsname, fieldsvalue))

        if sec['number'] == '0':
            continue

        sections.append(sec)

        if sec['name'] == '.text':
            textaddr = sec['addr']

    return (textaddr, sections)


class AddSymbolFileAll(gdb.Command):
    """The right version for add-symbol-file"""

    def __init__(self):
        super(AddSymbolFileAll, self).__init__("add-symbol-file-all", gdb.COMMAND_USER)
        self.dont_repeat()

    def invoke(self, arg, from_tty):
        argv = gdb.string_to_argv(arg)
        filename = argv[0]

        if len(argv) > 1:
            offset = int(str(gdb.parse_and_eval(argv[1])), 0)
        else:
            offset = 0

        (textaddr, sections) = relocatesections(filename, offset)

        cmd = "add-symbol-file %s 0x%08x" % (filename, int(textaddr, 16) + offset)

        for s in sections:
            addr = int(s['addr'], 16)
            if s['name'] == '.text' or addr == 0:
                continue

            cmd += " -s %s 0x%08x" % (s['name'], addr + offset)

        gdb.execute(cmd)

class RemoveSymbolFileAll(gdb.Command):
    """The right version for remove-symbol-file"""

    def __init__(self):
        super(RemoveSymbolFileAll, self).__init__("remove-symbol-file-all", gdb.COMMAND_USER)
        self.dont_repeat()

    def invoke(self, arg, from_tty):
        argv = gdb.string_to_argv(arg)
        filename = argv[0]

        if len(argv) > 1:
            offset = int(str(gdb.parse_and_eval(argv[1])), 0)
        else:
            offset = 0

        (textaddr, _) = relocatesections(filename, offset)

        cmd = "remove-symbol-file -a 0x%08x" % (int(textaddr, 16) + offset)
        gdb.execute(cmd)


AddSymbolFileAll()
RemoveSymbolFileAll()
end
Lavinalavine answered 12/10, 2015 at 18:24 Comment(1)
There is a very small improvement you can make: add -W when you call readelf, because there are libraries with long section names that get truncated (e.g. /lib/x86_64-linux-gnu/libnss_compat-2.27.so -> .note.gnu.build-id). - Hypocrisy
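A note on the parsing in the script above: splitting each row on spaces shifts the later fields when the Flg column is empty (as it is for the .debug_* rows in the question's listing). That is harmless there because only the name and address are used, and both come before the flags. The sketch below is one possible alternative, not the answer's code: it matches each `readelf -S -W` row of a 32-bit ELF with a regular expression and keeps only sections that carry the A (alloc) flag, instead of skipping sections whose address is 0. The names `ROW` and `build_command` are hypothetical.

```python
import re

# One section-header row of `readelf -S -W` for a 32-bit ELF, for example:
#   [ 7] .text  PROGBITS  00004c60 004c60 013816 00  AX  0   0  4
# The flags column may be empty, so it is matched with `*` rather than `+`.
ROW = re.compile(
    r"\[\s*(?P<nr>\d+)\]\s+(?P<name>\S+)\s+(?P<type>\S+)\s+"
    r"(?P<addr>[0-9a-fA-F]+)\s+(?P<off>[0-9a-fA-F]+)\s+(?P<size>[0-9a-fA-F]+)\s+"
    r"(?P<es>[0-9a-fA-F]+)\s+(?P<flags>[A-Za-z]*)\s*"
    r"(?P<link>\d+)\s+(?P<info>\d+)\s+(?P<align>\d+)\s*$"
)


def build_command(filename, base, readelf_lines):
    """Build an add-symbol-file command with every allocated section relocated by base."""
    text = None
    extra = []
    for line in readelf_lines:
        m = ROW.search(line)
        # Skip rows that are not section entries (header row, NULL section)
        # and sections that are not loaded into memory (no 'A' flag).
        if m is None or "A" not in m.group("flags"):
            continue
        addr = int(m.group("addr"), 16) + base
        if m.group("name") == ".text":
            text = addr
        else:
            extra.append("-s %s 0x%x" % (m.group("name"), addr))
    if text is None:
        raise ValueError("no .text section found")
    return " ".join(["add-symbol-file", filename, "0x%x" % text] + extra)
```

Inside gdb, the returned string could be passed to gdb.execute() the same way the script above does.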

Can I tell gdb to load all the symbols of all the sections provided that I specify the base address of the file in memory?

Yes, but you need to provide the address of the .text section, i.e. 0x7fab000+0x00004c60 here. I agree: it's quite annoying to have to fish out the address of .text, and I have wanted to fix it many times, so that e.g.

(gdb) add-symbol-file foo.so @0x7abc0000

just works. Feel free to file a feature request in GDB bugzilla.
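To make the arithmetic concrete for the file in the question: the address to pass is the load base plus the Addr value that readelf reports for .text. A minimal sketch, where the helper name `text_load_address` is hypothetical and the file name is shortened for illustration:

```python
def text_load_address(base, text_vaddr):
    """Run-time address of .text: load base plus the section's Addr from readelf -S."""
    return base + text_vaddr


# Values from the question: LOAD segment placed at 0x7fab000, .text Addr 0x4c60.
cmd = "add-symbol-file ldlinux.elf 0x%x" % text_load_address(0x7fab000, 0x4c60)
print(cmd)
```

Newer GDB releases also document an -o offset option for add-symbol-file that applies a single offset to every section; whether it is available depends on the GDB version in use, so check its manual before relying on it.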

Does the behavior of gdb make sense?

I am guessing that this is rooted in how GDB was used to debug embedded ROMs, where each section can be at an arbitrary memory address.

Dignitary answered 10/10, 2015 at 2:36 Comment(7)
If I want to add the symbols for data, I'll have to add the options and addresses of the .data, .rodata and .bss sections. And maybe more if I have specific things to debug. I'm writing a Python script right now to generate the right gdb command. - Lavinalavine
@Lavinalavine No: usually adding foo.so with the address of .text instead of the load address is all you need for both .text and .data. - Dignitary
No, I just tested: if I don't specify the load address of the .data section, the symbols pointing to it get loaded but don't get the offset. - Lavinalavine
GDB complains: A syntax error in expression, near '0x7abc000'. - Switchboard
@Switchboard You are complaining about a missing feature that I wanted to fix. I have not fixed it yet, so of course GDB complains. - Dignitary
@EmployedRussian did you ever add this feature? - Hooch
Instead of calling readelf via a subprocess, you can use Python's pyelftools package for parsing ELF sections. - Azar
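The section headers can also be read without a subprocess and without a third-party package. The following is a minimal sketch using only the standard struct module, assuming a little-endian ELF32 file like the one in the question; it is not what pyelftools does internally, and the function name `alloc_sections` is hypothetical.

```python
import struct

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # ELF32 little-endian file header (52 bytes)
SHDR = struct.Struct("<10I")               # ELF32 section header (40 bytes)
SHF_ALLOC = 0x2                            # section occupies memory at run time


def alloc_sections(data):
    """Return [(name, sh_addr)] for every allocated section in ELF32 image bytes."""
    hdr = EHDR.unpack_from(data, 0)
    ident = hdr[0]
    # e_ident: magic, then EI_CLASS == 1 (32-bit) and EI_DATA == 1 (little-endian).
    if ident[:4] != b"\x7fELF" or ident[4] != 1 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF32 file")
    shoff, shentsize, shnum, shstrndx = hdr[6], hdr[11], hdr[12], hdr[13]

    headers = [SHDR.unpack_from(data, shoff + i * shentsize) for i in range(shnum)]
    strtab_off = headers[shstrndx][4]      # sh_offset of the section name string table

    def name_at(index):
        start = strtab_off + index
        return data[start:data.index(b"\0", start)].decode("ascii")

    # Tuple layout: sh_name, sh_type, sh_flags, sh_addr, sh_offset, ...
    return [(name_at(h[0]), h[3]) for h in headers if h[2] & SHF_ALLOC]
```

A caller would read the file with open(filename, "rb").read(), add the load base to each returned address, and build the add-symbol-file command from the result.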
