How to disassemble a binary executable in Linux to get the assembly code?
S

11

138

I was told to use a disassembler. Does gcc have anything built in? What is the easiest way to do this?

Sauterne answered 26/2, 2011 at 8:38 Comment(2)
And reassemble afterwards: #4310271Funicular
Related: How to remove "noise" from GCC/clang assembly output? - if you really just want to see what the compiler did, you don't always need to compile + link + disassemble.Crabber
N
207

I don't think gcc has a flag for it, since it's primarily a compiler, but another of the GNU development tools does. objdump takes a -d/--disassemble flag:

$ objdump -d /path/to/binary

The disassembly looks like this:

080483b4 <main>:
 80483b4:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80483b8:   83 e4 f0                and    $0xfffffff0,%esp
 80483bb:   ff 71 fc                pushl  -0x4(%ecx)
 80483be:   55                      push   %ebp
 80483bf:   89 e5                   mov    %esp,%ebp
 80483c1:   51                      push   %ecx
 80483c2:   b8 00 00 00 00          mov    $0x0,%eax
 80483c7:   59                      pop    %ecx
 80483c8:   5d                      pop    %ebp
 80483c9:   8d 61 fc                lea    -0x4(%ecx),%esp
 80483cc:   c3                      ret    
 80483cd:   90                      nop
 80483ce:   90                      nop
 80483cf:   90                      nop
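
On a large binary, objdump -d prints every function. A minimal sketch of narrowing the output to a single function, assuming objdump's usual layout (a "<name>:" header line, the instructions, then a blank line). disas_func is a hypothetical helper name, not part of binutils:

```shell
# Print the disassembly of one function from `objdump -d` output on stdin.
# Assumes each function starts with a header such as "080483b4 <main>:"
# and ends at the next blank line.
disas_func() {
    awk -v name="$1" '
        $2 == ("<" name ">:") { printing = 1 }   # header of the wanted function
        printing && /^$/      { exit }           # blank line ends the function
        printing              { print }
    '
}

# Usage (path and function name are placeholders):
#   objdump -d /path/to/binary | disas_func main
```

This only filters text, so it works the same with -Mintel or any other objdump formatting options.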
Natalianatalie answered 26/2, 2011 at 8:43 Comment(5)
For intel-syntax: objdump -Mintel -d. Or Agner Fog's objconv disassembler is the nicest one I've tried yet (see my answer). Adding numbered labels to branch-targets is really really nice.Crabber
Useful options: objdump -drwC -Mintel. -r shows relocations from the symbol table. -C demangles C++ names. -w avoids line wrapping for long instructions. If you use it often, this is handy: alias disas='objdump -drwC -Mintel'.Crabber
Add -S to display source code intermixed with disassembly. (As pointed in another answer.)Labana
Is there a disassembler that outputs only the AT&T assembly, without all the addresses, binary encodings, etc.?Twobyfour
@user135142: Agner Fog's objconv can output GAS .intel_syntax noprefix code that's ready to re-assemble; machine code hex only in comments. It doesn't support AT&T syntax, but it can produce a .s that's ready to assemble with GNU tools. (IDK if it works around the problems GAS .intel_syntax noprefix has with symbol names from int offset and int eax by putting them in quotes.)Crabber
B
64

An interesting alternative to objdump is gdb. You don't have to run the binary or have debuginfo.

$ gdb -q ./a.out 
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) info functions 
All defined functions:

Non-debugging symbols:
0x00000000004003a8  _init
0x00000000004003e0  __libc_start_main@plt
0x00000000004003f0  __gmon_start__@plt
0x0000000000400400  _start
0x0000000000400430  deregister_tm_clones
0x0000000000400460  register_tm_clones
0x00000000004004a0  __do_global_dtors_aux
0x00000000004004c0  frame_dummy
0x00000000004004f0  fce
0x00000000004004fb  main
0x0000000000400510  __libc_csu_init
0x0000000000400580  __libc_csu_fini
0x0000000000400584  _fini
(gdb) disassemble main
Dump of assembler code for function main:
   0x00000000004004fb <+0>:     push   %rbp
   0x00000000004004fc <+1>:     mov    %rsp,%rbp
   0x00000000004004ff <+4>:     sub    $0x10,%rsp
   0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
   0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)
   0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax
   0x000000000040050e <+19>:    leaveq 
   0x000000000040050f <+20>:    retq   
End of assembler dump.
(gdb) disassemble fce
Dump of assembler code for function fce:
   0x00000000004004f0 <+0>:     push   %rbp
   0x00000000004004f1 <+1>:     mov    %rsp,%rbp
   0x00000000004004f4 <+4>:     mov    $0x2a,%eax
   0x00000000004004f9 <+9>:     pop    %rbp
   0x00000000004004fa <+10>:    retq   
End of assembler dump.
(gdb)

With full debugging info it's even better.

(gdb) disassemble /m main
Dump of assembler code for function main:
9       {
   0x00000000004004fb <+0>:     push   %rbp
   0x00000000004004fc <+1>:     mov    %rsp,%rbp
   0x00000000004004ff <+4>:     sub    $0x10,%rsp

10        int x = fce ();
   0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
   0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)

11        return x;
   0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax

12      }
   0x000000000040050e <+19>:    leaveq 
   0x000000000040050f <+20>:    retq   

End of assembler dump.
(gdb)

objdump has a similar option: -S interleaves the source code with the disassembly when debug info is available.
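
The gdb sessions above can also be run without the interactive prompt. A small sketch using gdb's -batch and -ex options; the gdb_disas name and the GDB override variable are assumptions made for this example, not gdb conventions:

```shell
# Disassemble one function non-interactively.
# -batch makes gdb exit after running the commands passed with -ex.
# $1: path to the binary, $2: function name.
# GDB can be set to use a different gdb executable (hypothetical convenience).
gdb_disas() {
    "${GDB:-gdb}" -batch -ex "disassemble $2" "$1"
}

# Usage:
#   gdb_disas ./a.out main
```

To get the source-interleaved form, replace the command string with "disassemble /m $2".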

Bulletproof answered 2/7, 2014 at 12:4 Comment(0)
C
20

This answer is specific to x86. Portable tools that can disassemble AArch64, MIPS, or whatever machine code include objdump and llvm-objdump.


Agner Fog's disassembler, objconv, is quite nice. It will add comments to the disassembly output for performance problems (like the dreaded LCP stall from instructions with 16bit immediate constants, for example).

objconv  -fyasm a.out /dev/stdout | less

(It doesn't recognize - as shorthand for stdout, and defaults to outputting to a file of similar name to the input file, with .asm tacked on.)

It also adds labels at branch targets. Other disassemblers usually disassemble jump instructions with just a numeric destination, and don't put any marker at a branch target to help you find the top of loops and so on.

It also indicates NOPs more clearly than other disassemblers (making it clear when there's padding, rather than disassembling it as just another instruction.)

It's open source, and easy to compile for Linux. It can disassemble into NASM, YASM, MASM, or GNU assembler (GAS .intel_syntax noprefix) syntax; it does not produce AT&T syntax.

Sample output:

; Filling space: 0FH
; Filler type: Multi-byte NOP
;       db 0FH, 1FH, 44H, 00H, 00H, 66H, 2EH, 0FH
;       db 1FH, 84H, 00H, 00H, 00H, 00H, 00H

ALIGN   16

foo:    ; Function begin
        cmp     rdi, 1                                  ; 00400620 _ 48: 83. FF, 01
        jbe     ?_026                                   ; 00400624 _ 0F 86, 00000084
        mov     r11d, 1                                 ; 0040062A _ 41: BB, 00000001
?_020:  mov     r8, r11                                 ; 00400630 _ 4D: 89. D8
        imul    r8, r11                                 ; 00400633 _ 4D: 0F AF. C3
        add     r8, rdi                                 ; 00400637 _ 49: 01. F8
        cmp     r8, 3                                   ; 0040063A _ 49: 83. F8, 03
        jbe     ?_029                                   ; 0040063E _ 0F 86, 00000097
        mov     esi, 1                                  ; 00400644 _ BE, 00000001
; Filling space: 7H
; Filler type: Multi-byte NOP
;       db 0FH, 1FH, 80H, 00H, 00H, 00H, 00H

ALIGN   8
?_021:  add     rsi, rsi                                ; 00400650 _ 48: 01. F6
        mov     rax, rsi                                ; 00400653 _ 48: 89. F0
        imul    rax, rsi                                ; 00400656 _ 48: 0F AF. C6
        shl     rax, 2                                  ; 0040065A _ 48: C1. E0, 02
        cmp     r8, rax                                 ; 0040065E _ 49: 39. C0
        jnc     ?_021                                   ; 00400661 _ 73, ED
        lea     rcx, [rsi+rsi]                          ; 00400663 _ 48: 8D. 0C 36
...

Note that this output is ready to be assembled back into an object file, so you can tweak the code at the asm source level, rather than with a hex-editor on the machine code. (So you aren't limited to keeping things the same size.) With no changes, the result should be near-identical. It might not be, though, since disassembly of stuff like

  (from /lib/x86_64-linux-gnu/libc.so.6)

SECTION .plt    align=16 execute                        ; section number 11, code

?_00001:; Local function
        push    qword [rel ?_37996]                     ; 0001F420 _ FF. 35, 003A4BE2(rel)
        jmp     near [rel ?_37997]                      ; 0001F426 _ FF. 25, 003A4BE4(rel)

...    
ALIGN   8
?_00002:jmp     near [rel ?_37998]                      ; 0001F430 _ FF. 25, 003A4BE2(rel)

; Note: Immediate operand could be made smaller by sign extension
        push    11                                      ; 0001F436 _ 68, 0000000B
; Note: Immediate operand could be made smaller by sign extension
        jmp     ?_00001                                 ; 0001F43B _ E9, FFFFFFE0

doesn't have anything in the source to make sure it assembles to the longer encoding that leaves room for relocations to rewrite it with a 32bit offset.


If you don't want to install objconv, GNU binutils objdump -drwC -Mintel is very usable, and will already be installed if you have a normal Linux gcc setup. I use alias disas='objdump -drwC -Mintel' on my system. (-w is no line-wrapping, -C is demangle, -r prints relocations in object files.)

llvm-objdump -d also works, and a single llvm-objdump binary can disassemble for a variety of architectures. (Unlike GNU objdump, where you'd need a separate build per architecture, like aarch64-linux-gnu-objdump -d.) Similarly, clang -O3 -target mips -c or clang -O3 -target riscv32 -c or whatever are useful for compiling for architectures you're interested in, but not interested enough to bother installing a cross-compiler. (https://godbolt.org/ Compiler Explorer is also a useful resource for that; see How to remove "noise" from GCC/clang assembly output? for more about it and writing small functions that compile to interesting asm.)

Crabber answered 29/11, 2015 at 2:45 Comment(0)
F
6

There's also ndisasm, which has some quirks, but can be more useful if you use nasm. I agree with Michael Mrozek that objdump is probably best.

[later] You might also want to check out Albert van der Horst's ciasdis: http://home.hccnet.nl/a.w.m.van.der.horst/forthassembler.html. It can be hard to understand, but has some interesting features you won't likely find anywhere else.

Filly answered 26/2, 2011 at 9:10 Comment(1)
In particular: home.hccnet.nl/a.w.m.van.der.horst/ciasdis.html contains under "latest developments" a debian package that you can install easily. With proper instructions (it does scripting) it will generate a source file that will reassemble again to the exact same binary. I'm not aware of any package that can do that. It may be hard to use from the instructions, I intend to publish in github with extensive examples.Dominant
C
4

Use IDA Pro and the Decompiler.

Coz answered 26/2, 2011 at 8:43 Comment(4)
IDA seems a bit overkill for this, especially considering it's rather expensiveNatalianatalie
The free version is not available for Linux, only the limited demo version. (Too bad, because on Windows it's the best disassembler I have ever used.)Spitzer
IDA is good, but the problem is that you get lazy if you use it for small tasks. gdb does the job for almost everything. Is gdb easier? No, but it's possible.Barrelhouse
IDA is proprietary software; it doesn't respect the user's freedom. It contains DRM which restricts the user from using many features. Moreover, it's paid software. See gnu.org/proprietary/proprietary.html.Geisel
M
4

You might find ODA useful. It's a web-based disassembler that supports tons of architectures.

http://onlinedisassembler.com/

Marxismleninism answered 28/3, 2014 at 3:14 Comment(1)
great idea. getting Server Error (500) to onlinedisassembler.com/odaweb - hope it's transient.Undo
U
3

You can come pretty damn close (but no cigar) to generating assembly that will reassemble, if that's what you are intending to do, using this rather crude and tediously long pipeline trick (replace /bin/bash with the file you intend to disassemble and bash.S with what you intend to send the output to):

objdump --no-show-raw-insn -Matt,att-mnemonic -Dz /bin/bash | grep -v "file format" | grep -v "(bad)" | sed '1,4d' | cut -d' ' -f2- | cut -d '<' -f2 | tr -d '>' | cut -f2- | sed -e "s/of\ section/#Disassembly\ of\ section/" | grep -v "\.\.\." > bash.S

Note how long this is, however. I really wish there was a better way (or, for that matter, a disassembler capable of outputting code that an assembler will recognize), but unfortunately there isn't.
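
The same idea can be sketched as a single awk filter. This assumes objdump's usual tab-separated layout and that --no-show-raw-insn is passed, so the instruction is the second tab-separated field. objdump_to_asm is a hypothetical name. Like the pipeline above, the output is a readable listing rather than something guaranteed to reassemble (numeric displacements, for example, are left as they are):

```shell
# Turn `objdump -d --no-show-raw-insn` output (stdin) into a bare listing:
# labels and instructions only, no addresses.
objdump_to_asm() {
    awk -F'\t' '
        # "00000000004004fb <main>:" becomes "main:"
        /^[0-9a-f]+ <.*>:$/ {
            sub(/^[0-9a-f]+ </, "")
            sub(/>:$/, ":")
            print
            next
        }
        # "  400503:<TAB>callq  4004f0 <fce>" becomes "    callq  fce"
        NF >= 2 && $1 ~ /^ *[0-9a-f]+:$/ {
            insn = $2
            if (insn ~ /\(bad\)/) next           # skip undecodable bytes
            sub(/[0-9a-f]+ </, "", insn)         # drop "ADDR <" before a symbol
            sub(/>$/, "", insn)                  # drop the closing ">"
            sub(/ +$/, "", insn)                 # trim trailing spaces
            print "    " insn
        }
    '
}

# Usage (same placeholder file names as above):
#   objdump -d --no-show-raw-insn /bin/bash | objdump_to_asm > bash.S
```

Lines that are neither a function header nor an instruction (the "file format" banner, section headings, "..." filler) match neither rule and are dropped.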

Unprincipled answered 15/11, 2019 at 8:13 Comment(1)
Wow! This is fantastic. Btw, regarding your problem, why don't you use an alias for it to skip typing this huge command?Shawn
P
1

ht editor can disassemble binaries in many formats. It is similar to Hiew, but open source.

To disassemble, open a binary, then press F6 and then select elf/image.

Pacesetter answered 28/5, 2017 at 13:6 Comment(0)
M
1

Let's say that you have:

#include <iostream>

double foo(double x)
{
  asm("# MyTag BEGIN"); // <- asm comment,
                        //    used later to locate piece of code
  double y = 2 * x + 1;

  asm("# MyTag END");

  return y;
}

int main()
{
  std::cout << foo(2);
}

To get assembly code using gcc you can do:

 g++ prog.cpp -c -S -o - -masm=intel | c++filt | grep -vE '\s+\.'

c++filt demangles symbols

grep -vE '\s+\.' removes some useless information

Now if you want to visualize the tagged part, simply use:

g++ prog.cpp -c -S -o - -masm=intel | c++filt | grep -vE '\s+\.' | grep "MyTag BEGIN" -A 20

With my computer I get:

    # MyTag BEGIN
# 0 "" 2
#NO_APP
    movsd   xmm0, QWORD PTR -24[rbp]
    movapd  xmm1, xmm0
    addsd   xmm1, xmm0
    addsd   xmm0, xmm1
    movsd   QWORD PTR -8[rbp], xmm0
#APP
# 9 "poub.cpp" 1
    # MyTag END
# 0 "" 2
#NO_APP
    movsd   xmm0, QWORD PTR -8[rbp]
    pop rbp
    ret
.LFE1814:
main:
.LFB1815:
    push    rbp
    mov rbp, rsp
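
Instead of the fixed -A 20 window, which also prints whatever follows the tagged region, the lines between the two markers could be selected directly. A minimal sketch, where tagged_region is a hypothetical helper name:

```shell
# Print only the lines strictly between "<tag> BEGIN" and "<tag> END".
# $1: tag name (e.g. MyTag); the assembly listing is read from stdin.
tagged_region() {
    awk -v tag="$1" '
        index($0, tag " END")   { inside = 0 }   # stop before the END marker
        inside                  { print }
        index($0, tag " BEGIN") { inside = 1 }   # start after the BEGIN marker
    '
}

# Usage:
#   g++ prog.cpp -c -S -o - -masm=intel | c++filt | tagged_region MyTag
```

The #APP / #NO_APP lines that GCC emits around inline asm still appear in the output, since they sit inside the tagged region.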

A more friendly approach is to use: Compiler Explorer

Midwest answered 15/11, 2019 at 8:55 Comment(1)
This is only reliable with optimization disabled, otherwise parts of the operations inside the region could optimize into stuff outside, or be optimized away. So you can only see the clunky -O0 asm.Crabber
T
0

If you have the C source, use: gcc -S ProgramName.c (this writes the assembly to ProgramName.s).

Example:

#include <stdio.h>

int myFunc(int x, int y) {
    char e = 'A';
    printf("%c, %d, %d\n", e, x, y);
    return 1;
}

int main() {
    int z = myFunc(5, 7);
    return 0;
}

Makes:

    .file   "temp.c"
    .text
    .section    .rodata
.LC0:
    .string "%c, %d, %d\n"
    .text
    .globl  myFunc
    .type   myFunc, @function
myFunc:
.LFB0:
    .cfi_startproc
    endbr64
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $32, %rsp
    movl    %edi, -20(%rbp)
    movl    %esi, -24(%rbp)
    movb    $65, -1(%rbp)
    movsbl  -1(%rbp), %eax
    movl    -24(%rbp), %ecx
    movl    -20(%rbp), %edx
    movl    %eax, %esi
    leaq    .LC0(%rip), %rax
    movq    %rax, %rdi
    movl    $0, %eax
    call    printf@PLT
    movl    $1, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   myFunc, .-myFunc
    .globl  main
    .type   main, @function
main:
.LFB1:
    .cfi_startproc
    endbr64
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movl    $7, %esi
    movl    $5, %edi
    call    myFunc
    movl    %eax, -4(%rbp)
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE1:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 12.3.0-1ubuntu1~23.04) 12.3.0"
    .section    .note.GNU-stack,"",@progbits
    .section    .note.gnu.property,"a"
    .align 8
    .long   1f - 0f
    .long   4f - 1f
    .long   5
0:
    .string "GNU"
1:
    .align 8
    .long   0xc0000002
    .long   3f - 2f
2:
    .long   0x3
3:
    .align 8
4:
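
Much of the listing above is assembler directives rather than instructions. A rough sketch of filtering them out while keeping labels; strip_directives is a hypothetical name. It also drops data directives such as .string, so the result is for reading, not for reassembling:

```shell
# Remove assembler directives from a gcc -S listing read on stdin.
strip_directives() {
    awk '
        # First token starts with "." but is not a label such as ".LC0:"
        $1 ~ /^\./ && $1 !~ /:$/ { next }
        { print }
    '
}

# Usage:
#   gcc -S -o - ProgramName.c | strip_directives
```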
Transilient answered 9/3, 2024 at 5:55 Comment(1)
See How to remove "noise" from GCC/clang assembly output? re: removing / reducing noise, like the .cfi directives, and enabling optimization so the assembly code is only doing what's necessary to implement the visible behaviour of the C functions. (So you should write functions that take args and return a value computed from them to see interesting asm.)Crabber
P
-3

Use ghidra: https://ghidra-sre.org/. It is already installed on Kali Linux.

Psychogenic answered 18/4, 2022 at 10:39 Comment(0)
