How to disassemble a memory range with GDB?
L

11

77

I'm trying to disassemble a program to see a syscall assembly instruction (the INT instruction, I believe) and the handler with GDB and have written a little program (see below) for it that opens and closes a file.

I was able to follow the call to fopen with GDB until it executed a call.

When I tried to tell GDB "disassemble 0x...." (address of call) it responded with 'No function contains specified address.'

Is it possible to force GDB to disassemble that memory address (or display it as assembly as well as possible)? If so, how?

#include <stdio.h>
#include <stdlib.h>

int main() {
    FILE* f;
    f = fopen("main.c", "r");
    if (!f) { 
      perror("open");
      return -1;
    }
    fclose(f);
    return 0;
}
Legacy answered 6/8, 2009 at 7:54 Comment(4)
fopen() is not a system call, it's a call to the C standard library. And why do you think a system call must be made via an INT instruction?Tarsuss
I may be wrong, but we were taught that fopen calls ultimately result in a system call to the kernel to open the file and return a file descriptor?Legacy
Patrick: Yes, but it does not need to do that directly. Normally it calls a libc function, which then enters the kernel. Entering the kernel can be done not only with int (which is slow) but also with syscall/sysenter, depending on the processor architecture...Brian
kexik - thank you for the information. I saw that Wikipedia mentions this in its system call article (en.wikipedia.org/w/…). Apparently Linux started using the special calls in 2.5 kernels. Another thing learned about my Operating System's architecture.Legacy
R
56

Do you only want to disassemble your actual main? If so try this:

(gdb) info line main 
(gdb) disas STARTADDRESS ENDADDRESS

Like so:

USER@MACHINE /cygdrive/c/prog/dsa
$ gcc-3.exe -g main.c

USER@MACHINE /cygdrive/c/prog/dsa
$ gdb a.exe
GNU gdb 6.8.0.20080328-cvs (cygwin-special)
...
(gdb) info line main
Line 3 of "main.c" starts at address 0x401050 <main> and ends at 0x401075 <main+
(gdb) disas 0x401050 0x401075
Dump of assembler code from 0x401050 to 0x401075:
0x00401050 <main+0>:    push   %ebp
0x00401051 <main+1>:    mov    %esp,%ebp
0x00401053 <main+3>:    sub    $0x18,%esp
0x00401056 <main+6>:    and    $0xfffffff0,%esp
0x00401059 <main+9>:    mov    $0x0,%eax
0x0040105e <main+14>:   add    $0xf,%eax
0x00401061 <main+17>:   add    $0xf,%eax
0x00401064 <main+20>:   shr    $0x4,%eax
0x00401067 <main+23>:   shl    $0x4,%eax
0x0040106a <main+26>:   mov    %eax,-0xc(%ebp)
0x0040106d <main+29>:   mov    -0xc(%ebp),%eax
0x00401070 <main+32>:   call   0x4010c4 <_alloca>
End of assembler dump.

I don't see your system interrupt call, however. (It's been a while since I last tried to make a system call in assembly; INT 21h, as far as I recall.)

Rinaldo answered 7/8, 2009 at 16:17 Comment(5)
ok, then I'll try to look for INT 21h in the future. Thanks for that hint. But what I wanted to try is to follow the call sequence originating in fopen() (don't see it in your code...) 'down' until I can see the INT command.Legacy
Managed it - The way to go is to use both your answer and Falaina's. I had to compile it statically with gcc --static main.c and then use gdb/objdump to go deep down into the C library. Ultimately, it resulted in a call to __open_nocancel, which did an INT 0x80. Thanks to both of youLegacy
Note: the disas 0x401050 0x401075 syntax won't work, at least in gdb 7.7. You have to write it as disas 0x401050,0x401075 instead. You might also want to add the «/m» modifier to show the surrounding source code: disas /m 0x401050,0x401075Underground
@Patrick, Although this was quite some time ago, it's worth noting that INT 0x80 is just how Linux does it. That is to say, Linux's syscall handler is registered at interrupt 128. Other operating systems may vary -- which they do.Qualify
Please update your answer - it fails with an error in my version of gdb because of a missing comma between STARTADDRESS and ENDADDRESSOrometer
G
129

Yeah, disassemble is not the best command to use here. The command you want is "x/i" (examine as instructions):

(gdb) x/i 0xdeadbeef
Giraffe answered 9/10, 2009 at 19:12 Comment(8)
THANKS! Adding this text to help others find this hint: this is the instruction to be used to disassemble binary blob, disassemble ROM, examine instruction in a binary image file etc. Write a small C program to fread() the binary blob into a buffer. Then do 'x /i' on the buffer.Lordsandladies
@Lordsandladies if you want to disassemble a binary blob, an easier way to do it is to use a standalone disassembler like ndisasm or similar.Snips
You can use: x/i $pc to get the instruction for the pc which is the address of current instructionBrian
You can use: "(gdb) x/<number>i 0xaddress" to print <number> instructions, e.g. "(gdb) x/10i 0xaddress" to print 10 instructionsRey
Is the address 0xdeadbeef of any significance here? Is it a special address?Sitnik
@shane: find more info about it here en.wikipedia.org/wiki/HexspeakBodycheck
Thanks! This is perfect for debugging JIT!Admittedly
For some reason, x/i SomeAddress shows "(bad)" in GDB. The instruction at this address should be a far jump. What might be the reason for GDB not recognizing it properly as this instruction?Pentathlon
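The suggestion in the comments above (fread() a binary blob into a buffer, then run x/i on the buffer) could be sketched roughly as follows. The function name load_blob and its signature are made up for illustration; this is a minimal sketch, not the commenter's actual program.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Read the whole file at `path` into a freshly malloc'd buffer.
 * On success, returns the buffer and stores its length in *size_out;
 * the caller must free() it. Returns NULL on any error or empty file.
 * (load_blob is a hypothetical helper name, not part of any library.)
 */
unsigned char *load_blob(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    unsigned char *buf = NULL;
    long len = -1;

    /* Determine the file size by seeking to the end and back. */
    if (fseek(f, 0, SEEK_END) == 0)
        len = ftell(f);

    if (len > 0 && fseek(f, 0, SEEK_SET) == 0) {
        buf = malloc((size_t)len);
        if (buf && fread(buf, 1, (size_t)len, f) != (size_t)len) {
            /* Short read: treat as failure. */
            free(buf);
            buf = NULL;
        }
        if (buf)
            *size_out = (size_t)len;
    }

    fclose(f);
    return buf;
}
```

Call it from a small main, compile with -g, stop in gdb after the call returns, and something like x/10i buf (where buf is whatever variable holds the returned pointer) should decode the bytes as instructions. This only gives meaningful output if the blob contains code for the architecture gdb is currently set to.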
M
34

This isn't the direct answer to your question, but since you seem to just want to disassemble the binary, perhaps you could just use objdump:

objdump -d program

This should give you its disassembly. You can add -S if you want it source-annotated.

Maze answered 7/8, 2009 at 16:25 Comment(2)
+1 for -S, I didn't know it could include the source code.Underground
This doesn't work if the program is compiled for a different architecture :(Musgrave
P
8

You can force gcc to output directly to assembly code by adding the -S switch

gcc -S hello.c
Paulownia answered 18/9, 2009 at 17:44 Comment(0)
A
7

fopen() is a C library function and so you won't see any syscall instructions in your code, just a regular function call. At some point, it does call open(2), but it does that via a trampoline. There is simply a jump to the VDSO page, which is provided by the kernel to every process. The VDSO then provides code to make the system call. On modern processors, the SYSCALL or SYSENTER instructions will be used, but you can also use INT 80h on x86 processors.
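To make the layering concrete, here is a minimal sketch that skips fopen() and open() and asks the kernel directly through glibc's generic syscall() wrapper. It assumes Linux with glibc; the helper names raw_open_readonly and raw_close are made up for illustration, and SYS_openat is used because not every architecture provides a plain SYS_open.

```c
#define _GNU_SOURCE
#include <fcntl.h>        /* O_RDONLY, AT_FDCWD */
#include <sys/syscall.h>  /* SYS_openat, SYS_close */
#include <unistd.h>       /* syscall() */

/*
 * Open `path` read-only by invoking the openat system call directly.
 * Returns a file descriptor (>= 0) on success, or -1 with errno set.
 * (Hypothetical helper; Linux/glibc is assumed.)
 */
long raw_open_readonly(const char *path)
{
    return syscall(SYS_openat, AT_FDCWD, path, O_RDONLY);
}

/* Close a descriptor via the close system call; returns 0 on success. */
long raw_close(long fd)
{
    return syscall(SYS_close, fd);
}
```

Setting a breakpoint on raw_open_readonly and single-stepping with stepi while watching x/i $pc should eventually land on the instruction that enters the kernel. Which instruction that is (int $0x80, sysenter, syscall, or something else) depends on the architecture and the libc build.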

Astto answered 25/1, 2010 at 2:22 Comment(0)
B
3

If all you want is to see the disassembly with the INT call, use objdump -d as someone mentioned, but use the -static option when compiling. Otherwise the fopen function is not compiled into the ELF file and is linked at runtime.

Brig answered 22/10, 2009 at 16:51 Comment(0)
I
3

gdb's disassemble command has a /m modifier to include source code alongside the instructions. This is the equivalent of objdump -S, with the extra benefit of confining the output to just the one function (or address range) of interest.

Insipid answered 26/2, 2015 at 19:12 Comment(0)
M
3

The accepted answer is not really correct, although it does work in some circumstances.

 (gdb) disas STARTADDRESS ENDADDRESS

The highest upvoted answer is correct. Read no further if you don't wish to understand why it is correct.

 (gdb) x/i 0xdeadbeef

With an appropriately meaningless hex address.


I have an STM32 and I have relocated the code with PIC. The normal boot address is 0x8000000, with a 0x200 vector table. So a normal entry is 0x8000200. However, I have programmed the binary to 0x80040200 (two NOR flash sectors away) and wish to debug there.

The issue gdb has with this is that 'file foo.elf' tells it the code is in the first range. Special commands like 'disassemble' will actually look at the binary on the host. For the cross-debug case, gdb would have to look at memory on the remote, which could be expensive. So it appears that 'x /i' (examine as code) is the best option. The debug information that gdb depends on (where routines start and end) is not present in a random binary chunk.


To combine the answers above for PIC code on an embedded cross system,

You need to create multiple ELF files, one for each possible target location. Use GDB's file command to select the one with the proper symbol locations.


This will NOT work for Cross development

You can use debug symbols generated by gcc. The steps are:

  1. Build normal link address.
  2. Extract symbols.
  3. Use symbol-file with an offset for the runtime address.
  (gdb) help symbol-file
  Load symbol table from executable file FILE.
  Usage: symbol-file [-readnow | -readnever] [-o OFF] FILE
  OFF is an optional offset which is added to each section address.

You can then switch symbol files to a relocated run address to use the first answer.


If you have a case where the code is relocated but the data is absolute, you need to link twice and choose the relocated ELF file (only the symbols are relocated; the code is the same). This is desirable with NOR flash that is XIP (execute-in-place), as the memory devices for .text and .rodata are different from those for .data and .bss. This is the case for many low- to mid-range embedded devices. However, gcc does not support this code generation option (at least on ARM). You must use a 'static base' register (for example, r9 as u-boot does).

Marinemarinelli answered 15/8, 2022 at 22:55 Comment(3)
symbol-file does not work for cross debug as gdb thinks the machine is the host type. You need to load two elf files and load the 2nd address elf to debug PIC code remotely. You can compare extracted 'bin' files to see where absolute addresses remain and ensure they are fixed up.Marinemarinelli
Overlays are an inversion: here you must select the symbols that gdb believes are in an address range. x /i will always work, but the addresses will not be decoded and will be difficult to understand.Marinemarinelli
My main point is that 'x/i hexaddress' always works, but it gives no symbol information. It is raw disassembly only. The answer gives hints for getting the first form to work, so that you actually have symbolic constants in the disassembly.Marinemarinelli
Z
1

You don't have to use gdb. GCC will do it.

 gcc -S foo.c

This will create foo.s which is the assembly.

gcc -m32 -c -g -Wa,-a,-ad foo.c > foo.lst

The above version will create a listing file that has both the C and the assembly generated by it. GCC FAQ

Zendah answered 23/7, 2014 at 3:43 Comment(0)
C
1

Full example of disassembling a memory range with the C source interleaved:

/opt/gcc-arm-none-eabi-9-2019-q4-major/bin/arm-none-eabi-gdb

(gdb) file /root/ncs/zephyr/samples/hello_world/build_nrf9160dk_nrf9160ns/zephyr/zephyr.elf
(gdb) directory /root/ncs/zephyr/samples/hello_world/src
#here you want 1
(gdb) info line *0x000328C0
#here you want 2, -0x04 ~ +0x04 is your range size
(gdb) disassemble /m 0x000328C0-0x04, 0x000328C0+0x04
#here with binary code
(gdb) disassemble /r 0x000328C0-0x04, 0x000328C0+0x04
(gdb) info thread
(gdb) interpreter-exec mi -thread-info
Confect answered 17/9, 2021 at 7:39 Comment(0)
S
0

There is another way using gdb that I wanted to present, on top of the suggestions above: launch your program with gdb, set a breakpoint on main with break *main, and run. Then you can use info proc mappings.
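For readers unsure what the mappings list is good for: on Linux it tells you which mapped range (and which permissions) an address belongs to, which helps pick a sensible range to disassemble. The sketch below does a similar lookup from inside the process by parsing /proc/self/maps. It assumes Linux; find_mapping is a made-up helper name, and lines longer than the buffer are not handled carefully.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up the memory mapping that contains `addr` in /proc/self/maps.
 * On success returns 1 and fills in the range [*start, *end) and the
 * permission string (e.g. "r-xp"). Returns 0 if no mapping contains it
 * or the maps file cannot be read. (Hypothetical helper, Linux only.)
 */
int find_mapping(uintptr_t addr, uintptr_t *start, uintptr_t *end, char perms[5])
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps)
        return 0;

    char line[1024];
    int found = 0;

    /* Each line starts with "start-end perms ...", addresses in hex. */
    while (!found && fgets(line, sizeof line, maps)) {
        uintptr_t lo, hi;
        char p[5];
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s", &lo, &hi, p) == 3
            && addr >= lo && addr < hi) {
            *start = lo;
            *end = hi;
            memcpy(perms, p, sizeof p);
            found = 1;
        }
    }

    fclose(maps);
    return found;
}
```

An address that falls in a mapping with the x permission bit set is a plausible candidate for x/i or for a disas START,END range within that mapping.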

Suspensor answered 30/9, 2022 at 5:43 Comment(0)
