I was told to use a disassembler. Does gcc
have anything built in? What is the easiest way to do this?
I don't think gcc
has a flag for it, since it's primarily a compiler, but another of the GNU development tools does. objdump
takes a -d
/--disassemble
flag:
$ objdump -d /path/to/binary
The disassembly looks like this:
080483b4 <main>:
80483b4: 8d 4c 24 04 lea 0x4(%esp),%ecx
80483b8: 83 e4 f0 and $0xfffffff0,%esp
80483bb: ff 71 fc pushl -0x4(%ecx)
80483be: 55 push %ebp
80483bf: 89 e5 mov %esp,%ebp
80483c1: 51 push %ecx
80483c2: b8 00 00 00 00 mov $0x0,%eax
80483c7: 59 pop %ecx
80483c8: 5d pop %ebp
80483c9: 8d 61 fc lea -0x4(%ecx),%esp
80483cc: c3 ret
80483cd: 90 nop
80483ce: 90 nop
80483cf: 90 nop
objdump -Mintel -d
. Or Agner Fog's objconv disassembler is the nicest one I've tried yet (see my answer). Adding numbered labels to branch-targets is really really nice. –
Crabber objdump -drwC -Mintel
. -r
shows relocations from the symbol table. -C
demangles C++ names. -W
avoids line wrapping for long instructions. If you use it often, this is handy: alias disas='objdump -drwC -Mintel'
. –
Crabber -S
to display source code intermixed with disassembly. (As pointed in another answer.) –
Labana .intel_syntax noprefix
code that's ready to re-assemble; machine code hex only in comments. It doesn't support AT&T syntax, but it can produce a .s
that's ready to assemble with GNU tools. (IDK if it works around the problems GAS .intel_syntax noprefix
has with symbol names from int offset
and int eax
by putting them in quotes.) –
Crabber An interesting alternative to objdump is gdb. You don't have to run the binary or have debuginfo.
$ gdb -q ./a.out
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) info functions
All defined functions:
Non-debugging symbols:
0x00000000004003a8 _init
0x00000000004003e0 __libc_start_main@plt
0x00000000004003f0 __gmon_start__@plt
0x0000000000400400 _start
0x0000000000400430 deregister_tm_clones
0x0000000000400460 register_tm_clones
0x00000000004004a0 __do_global_dtors_aux
0x00000000004004c0 frame_dummy
0x00000000004004f0 fce
0x00000000004004fb main
0x0000000000400510 __libc_csu_init
0x0000000000400580 __libc_csu_fini
0x0000000000400584 _fini
(gdb) disassemble main
Dump of assembler code for function main:
0x00000000004004fb <+0>: push %rbp
0x00000000004004fc <+1>: mov %rsp,%rbp
0x00000000004004ff <+4>: sub $0x10,%rsp
0x0000000000400503 <+8>: callq 0x4004f0 <fce>
0x0000000000400508 <+13>: mov %eax,-0x4(%rbp)
0x000000000040050b <+16>: mov -0x4(%rbp),%eax
0x000000000040050e <+19>: leaveq
0x000000000040050f <+20>: retq
End of assembler dump.
(gdb) disassemble fce
Dump of assembler code for function fce:
0x00000000004004f0 <+0>: push %rbp
0x00000000004004f1 <+1>: mov %rsp,%rbp
0x00000000004004f4 <+4>: mov $0x2a,%eax
0x00000000004004f9 <+9>: pop %rbp
0x00000000004004fa <+10>: retq
End of assembler dump.
(gdb)
With full debugging info it's even better.
(gdb) disassemble /m main
Dump of assembler code for function main:
9 {
0x00000000004004fb <+0>: push %rbp
0x00000000004004fc <+1>: mov %rsp,%rbp
0x00000000004004ff <+4>: sub $0x10,%rsp
10 int x = fce ();
0x0000000000400503 <+8>: callq 0x4004f0 <fce>
0x0000000000400508 <+13>: mov %eax,-0x4(%rbp)
11 return x;
0x000000000040050b <+16>: mov -0x4(%rbp),%eax
12 }
0x000000000040050e <+19>: leaveq
0x000000000040050f <+20>: retq
End of assembler dump.
(gdb)
objdump has a similar option (-S)
This answer is specific to x86. Portable tools that can disassemble AArch64, MIPS, or whatever machine code include objdump
and llvm-objdump
.
Agner Fog's disassembler, objconv
, is quite nice. It will add comments to the disassembly output for performance problems (like the dreaded LCP stall from instructions with 16bit immediate constants, for example).
objconv -fyasm a.out /dev/stdout | less
(It doesn't recognize -
as shorthand for stdout, and defaults to outputting to a file of similar name to the input file, with .asm
tacked on.)
It also adds branch targets to the code. Other disassemblers usually disassemble jump instructions with just a numeric destination, and don't put any marker at a branch target to help you find the top of loops and so on.
It also indicates NOPs more clearly than other disassemblers (making it clear when there's padding, rather than disassembling it as just another instruction.)
It's open source, and easy to compile for Linux. It can disassemble into NASM, YASM, MASM, or GNU (AT&T) syntax.
Sample output:
; Filling space: 0FH
; Filler type: Multi-byte NOP
; db 0FH, 1FH, 44H, 00H, 00H, 66H, 2EH, 0FH
; db 1FH, 84H, 00H, 00H, 00H, 00H, 00H
ALIGN 16
foo: ; Function begin
cmp rdi, 1 ; 00400620 _ 48: 83. FF, 01
jbe ?_026 ; 00400624 _ 0F 86, 00000084
mov r11d, 1 ; 0040062A _ 41: BB, 00000001
?_020: mov r8, r11 ; 00400630 _ 4D: 89. D8
imul r8, r11 ; 00400633 _ 4D: 0F AF. C3
add r8, rdi ; 00400637 _ 49: 01. F8
cmp r8, 3 ; 0040063A _ 49: 83. F8, 03
jbe ?_029 ; 0040063E _ 0F 86, 00000097
mov esi, 1 ; 00400644 _ BE, 00000001
; Filling space: 7H
; Filler type: Multi-byte NOP
; db 0FH, 1FH, 80H, 00H, 00H, 00H, 00H
ALIGN 8
?_021: add rsi, rsi ; 00400650 _ 48: 01. F6
mov rax, rsi ; 00400653 _ 48: 89. F0
imul rax, rsi ; 00400656 _ 48: 0F AF. C6
shl rax, 2 ; 0040065A _ 48: C1. E0, 02
cmp r8, rax ; 0040065E _ 49: 39. C0
jnc ?_021 ; 00400661 _ 73, ED
lea rcx, [rsi+rsi] ; 00400663 _ 48: 8D. 0C 36
...
Note that this output is ready to be assembled back into an object file, so you can tweak the code at the asm source level, rather than with a hex-editor on the machine code. (So you aren't limited to keeping things the same size.) With no changes, the result should be near-identical. It might not be, though, since disassembly of stuff like
(from /lib/x86_64-linux-gnu/libc.so.6)
SECTION .plt align=16 execute ; section number 11, code
?_00001:; Local function
push qword [rel ?_37996] ; 0001F420 _ FF. 35, 003A4BE2(rel)
jmp near [rel ?_37997] ; 0001F426 _ FF. 25, 003A4BE4(rel)
...
ALIGN 8
?_00002:jmp near [rel ?_37998] ; 0001F430 _ FF. 25, 003A4BE2(rel)
; Note: Immediate operand could be made smaller by sign extension
push 11 ; 0001F436 _ 68, 0000000B
; Note: Immediate operand could be made smaller by sign extension
jmp ?_00001 ; 0001F43B _ E9, FFFFFFE0
doesn't have anything in the source to make sure it assembles to the longer encoding that leaves room for relocations to rewrite it with a 32bit offset.
If you don't want to install it objconv, GNU binutils objdump -drwC -Mintel
is very usable, and will already be installed if you have a normal Linux gcc setup. I use alias disas='objdump -drwC -Mintel'
on my system. (-w
is no line-wrapping, -C
is demangle, -r
prints relocations in object files.)
llvm-objdump -d
also works, and can disassemble for a variety of architectures from a single binary. (Unlike GNU objdump
where you'd need a separate per arch, like aarch64-linux-gnu-objdump -d
.) Similarly, clang -O3 -target mips -c
or clang -O3 -target riscv32 -c
or whatever are useful to compile for architectures you're interested in, but not interested enough to bother installing a cross-compiler. (https://godbolt.org/ Compiler Explorer is also a useful resource for that; see How to remove "noise" from GCC/clang assembly output? for more about it and writing small functions that compile to interesting asm.)
there's also ndisasm, which has some quirks, but can be more useful if you use nasm. I agree with Michael Mrozek that objdump is probably best.
[later] you might also want to check out Albert van der Horst's ciasdis: http://home.hccnet.nl/a.w.m.van.der.horst/forthassembler.html. it can be hard to understand, but has some interesting features you won't likely find anywhere else.
Use IDA Pro and the Decompiler.
You might find ODA useful. It's a web-based disassembler that supports tons of architectures.
You can come pretty damn close (but no cigar) to generating assembly that will reassemble, if that's what you are intending to do, using this rather crude and tediously long pipeline trick (replace /bin/bash with the file you intend to disassemble and bash.S with what you intend to send the output to):
objdump --no-show-raw-insn -Matt,att-mnemonic -Dz /bin/bash | grep -v "file format" | grep -v "(bad)" | sed '1,4d' | cut -d' ' -f2- | cut -d '<' -f2 | tr -d '>' | cut -f2- | sed -e "s/of\ section/#Disassembly\ of\ section/" | grep -v "\.\.\." > bash.S
Note how long this is, however. I really wish there was a better way (or, for that matter, a disassembler capable of outputting code that an assembler will recognize), but unfortunately there isn't.
ht editor can disassemble binaries in many formats. It is similar to Hiew, but open source.
To disassemble, open a binary, then press F6 and then select elf/image.
Let's say that you have:
#include <iostream>
double foo(double x)
{
asm("# MyTag BEGIN"); // <- asm comment,
// used later to locate piece of code
double y = 2 * x + 1;
asm("# MyTag END");
return y;
}
int main()
{
std::cout << foo(2);
}
To get assembly code using gcc you can do:
g++ prog.cpp -c -S -o - -masm=intel | c++filt | grep -vE '\s+\.'
c++filt
demangles symbols
grep -vE '\s+\.'
removes some useless information
Now if you want to visualize the tagged part, simply use:
g++ prog.cpp -c -S -o - -masm=intel | c++filt | grep -vE '\s+\.' | grep "MyTag BEGIN" -A 20
With my computer I get:
# MyTag BEGIN
# 0 "" 2
#NO_APP
movsd xmm0, QWORD PTR -24[rbp]
movapd xmm1, xmm0
addsd xmm1, xmm0
addsd xmm0, xmm1
movsd QWORD PTR -8[rbp], xmm0
#APP
# 9 "poub.cpp" 1
# MyTag END
# 0 "" 2
#NO_APP
movsd xmm0, QWORD PTR -8[rbp]
pop rbp
ret
.LFE1814:
main:
.LFB1815:
push rbp
mov rbp, rsp
A more friendly approach is to use: Compiler Explorer
-O0
asm. –
Crabber Use: gcc -S ProgramName.c
Example:
#include <stdio.h>
int myFunc(int x, int y) {
char e = 'A';
printf("%c, %d, %d\n", e, x, y);
return 1;
}
int main() {
int z = myFunc(5, 7);
return 0;
}
Makes:
.file "temp.c"
.text
.section .rodata
.LC0:
.string "%c, %d, %d\n"
.text
.globl myFunc
.type myFunc, @function
myFunc:
.LFB0:
.cfi_startproc
endbr64
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movl %edi, -20(%rbp)
movl %esi, -24(%rbp)
movb $65, -1(%rbp)
movsbl -1(%rbp), %eax
movl -24(%rbp), %ecx
movl -20(%rbp), %edx
movl %eax, %esi
leaq .LC0(%rip), %rax
movq %rax, %rdi
movl $0, %eax
call printf@PLT
movl $1, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size myFunc, .-myFunc
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
endbr64
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movl $7, %esi
movl $5, %edi
call myFunc
movl %eax, -4(%rbp)
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (Ubuntu 12.3.0-1ubuntu1~23.04) 12.3.0"
.section .note.GNU-stack,"",@progbits
.section .note.gnu.property,"a"
.align 8
.long 1f - 0f
.long 4f - 1f
.long 5
0:
.string "GNU"
1:
.align 8
.long 0xc0000002
.long 3f - 2f
2:
.long 0x3
3:
.align 8
4:
.cfi
directives, and enabling optimization so the assembly code is only doing what's necessary to implement the visible behaviour of the C functions. (So you should write functions that take args and return a value computed from them to see interesting asm.) –
Crabber Use ghidra: https://ghidra-sre.org/. It is already installed on Kali Linux.
© 2022 - 2025 — McMap. All rights reserved.