If you're disassembling .o object files that haven't been linked yet, the call address will just be a placeholder to be filled in by the linker.
You can use objdump -drwC -Mintel to show the relocation types and symbol names from a .o file. (The -r option is the key; use -R for an already-linked shared library.)
It's more useful to the user to show the actual address of the jump target, rather than disassemble it as jcc eip-1234H or something. Linked executables have a default load address (and the sections of an unlinked object file are treated as starting at zero), so the disassembler has a value for eip at every instruction, and this is usually present in disassembly output.
e.g. in some asm code I wrote (where I use symbol names that made it into the object file, so the loop branch target is actually visible to the disassembler):
objdump -M intel -d rs-asmbench:
...
00000000004020a0 <.loop>:
4020a0: 0f b6 c2 movzx eax,dl
4020a3: 0f b6 de movzx ebx,dh
...
402166: 49 83 c3 10 add r11,0x10
40216a: 0f 85 30 ff ff ff jne 4020a0 <.loop>
0000000000402170 <.last8>:
402170: 0f b6 c2 movzx eax,dl
Note that the jne instruction encodes a signed little-endian 32-bit displacement of -0xD0 bytes. (Jumps add their displacement to the value of e/rip after the jump instruction, i.e. the address of the next instruction. The jump instruction itself is 6 bytes long, so the displacement has to be -0xD0, not just -0xCA.) 0x100 - 0xD0 = 0x30, which is the value of the least-significant byte of the 2's complement displacement.
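The displacement arithmetic for that jne can be checked with a few lines of Python. This is only a sketch: the helper name rel32_branch_target is made up for illustration, and it assumes the instruction ends in a 4-byte relative displacement, which is true of this jne but not of every x86 instruction.

```python
import struct

def rel32_branch_target(insn_addr, insn_bytes):
    """Return (displacement, target) for an instruction ending in a rel32.

    Hypothetical helper: assumes the last 4 bytes are the displacement.
    """
    # Signed, little-endian, 32-bit.
    (disp,) = struct.unpack("<i", insn_bytes[-4:])
    # The displacement is relative to the END of the instruction,
    # i.e. the address of the next instruction.
    next_insn = insn_addr + len(insn_bytes)
    return disp, next_insn + disp

# Bytes and address taken from the disassembly above.
jne = bytes.fromhex("0f8530ffffff")
disp, target = rel32_branch_target(0x40216A, jne)
print(hex(disp), hex(target))  # -0xd0 0x4020a0
```

Measuring from the start of the jne instead (0x4020a0 - 0x40216a) would give -0xCA, which is the off-by-instruction-length mistake the note above warns about.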
In your question, you're talking about the call addresses being 0xFFFF..., which makes little sense unless that's just a placeholder, or you thought the non-0xFF bytes in the displacement were part of the opcode.
Before linking, references to external symbols look like this:
objdump -M intel -d main.o
...
a5: 31 f6 xor esi,esi
a7: e8 00 00 00 00 call ac <main+0xac>
ac: 4c 63 e0 movsxd r12,eax
af: ba 00 00 00 00 mov edx,0x0
b4: 48 89 de mov rsi,rbx
b7: 44 89 f7 mov edi,r14d
ba: e8 00 00 00 00 call bf <main+0xbf>
bf: 83 f8 ff cmp eax,0xffffffff
c2: 75 cc jne 90 <main+0x90>
...
Notice how the call instructions have their relative displacement = 0. So before the linker has slotted in the actual relative value, they encode a call with a target of the instruction right after the call (i.e. RIP = RIP+0). The call bf is immediately followed by an instruction that starts at 0xbf from the start of the section. The other call has a different target address because it's at a different place in the file. (gcc puts main in its own section: .text.startup.)
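To see why objdump prints call ac and call bf for those placeholders, here is a rough Python sketch that decodes an e8 call with an all-zero displacement. The function name call_target is an illustrative assumption, and it only handles the 5-byte e8 rel32 form.

```python
import struct

CALL_REL32 = 0xE8  # opcode byte of the near relative call

def call_target(insn_addr, insn_bytes):
    """Target of a 5-byte `call rel32` located at insn_addr (hypothetical helper)."""
    if len(insn_bytes) != 5 or insn_bytes[0] != CALL_REL32:
        raise ValueError("not an e8 rel32 call")
    (disp,) = struct.unpack("<i", insn_bytes[1:])
    # Relative to the address of the instruction after the call.
    return insn_addr + 5 + disp

# Unrelocated placeholder, as it appears in main.o above.
placeholder = bytes.fromhex("e800000000")
for addr in (0xA7, 0xBA):
    print(hex(addr), "->", hex(call_target(addr, placeholder)))
# 0xa7 -> 0xac
# 0xba -> 0xbf
```

The same five bytes decode to different "targets" only because the two calls sit at different offsets in the section.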
So, if you want to make sense of what's actually being called, look at a linked executable, or get a disassembler that looks at the object file symbols to slot in symbolic names for call targets instead of showing them as calls with zero displacement.
Relative jumps to local symbols already get resolved before linking:
objdump -Mintel -d asm-pinsrw.o:
0000000000000040 <.loop>:
40: 0f b6 c2 movzx eax,dl
43: 0f b6 de movzx ebx,dh
...
106: 49 83 c3 10 add r11,0x10
10a: 0f 85 30 ff ff ff jne 40 <.loop>
0000000000000110 <.last8>:
110: 0f b6 c2 movzx eax,dl
Note that the relative jump to a symbol in the same file has the exact same instruction encoding as in the linked executable, even though the object file has no base address, so the disassembler just treats it as zero.
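Going in the other direction (address to bytes) shows why the encoding doesn't change at link time. The sketch below is an assumption-laden illustration, not how any particular assembler is implemented: encode_jne_rel32 is a made-up name, and it always emits the 6-byte 0f 85 rel32 form.

```python
import struct

def encode_jne_rel32(insn_addr, target):
    """Encode `jne target` in its 6-byte rel32 form (hypothetical helper)."""
    opcode = b"\x0f\x85"
    length = len(opcode) + 4
    # Only the distance from the end of the instruction matters.
    disp = target - (insn_addr + length)
    return opcode + struct.pack("<i", disp)

# Addresses from the linked executable and from the unlinked .o above.
linked = encode_jne_rel32(0x40216A, 0x4020A0)
unlinked = encode_jne_rel32(0x10A, 0x40)
print(linked.hex(" "))    # 0f 85 30 ff ff ff
print(unlinked.hex(" "))  # 0f 85 30 ff ff ff
```

Because only the distance between the jump and its target is encoded, moving the whole section to a new base address leaves the bytes untouched.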
See Intel's reference manual for instruction encoding; links are at https://stackoverflow.com/tags/x86/info. Even in 64-bit mode, call only supports 32-bit sign-extended relative offsets. 64-bit target addresses are only supported as absolute addresses, via an indirect call through a register or memory. (In 32-bit mode, 16-bit relative displacements are also supported with an operand-size prefix, I guess saving one instruction byte.)
call _foo or call 0x12345) and the assembler generates the appropriate machine language encoding. Disassemblers reverse this process. The fact that the code wasn't actually generated by an assembler doesn't change how disassemblers work. – Wallinga