Function call in the assembly language before linking

I was going through the assembly code generated by the compiler. I am using the C programming language and GCC compiler.

I wrote a C function that adds two numbers by calling another function and stores the result through the pointer passed as its third argument.

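/* dummy is defined in another translation unit; signature inferred from the call below */
int dummy(int x, int y);

void add_two_num(int x, int y, int * dest)
{
  int val;

  val = dummy(x, y);
  *dest = val;
}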

I compiled the source to an object file (without linking) and then disassembled it with objdump -d.

What is the meaning of the number +0x9 in the line call d <add_two_num+0x9>?
Is that useful at the stage of linking when that line will be replaced by the actual function call?

file format elf64-x86-64

0000000000000004 <add_two_num>:
   4:   53                      push   %rbx
   5:   48 89 d3                mov    %rdx,%rbx
   8:   e8 00 00 00 00          call   d <add_two_num+0x9>
   d:   89 03                   mov    %eax,(%rbx)
   f:   5b                      pop    %rbx
  10:   c3                      ret  
Afghani answered 3/6, 2022 at 13:44

You are looking at an object file. It has not been linked yet, so the addresses of external functions have not been filled in. You can see this in the instruction encoding: the 00 00 00 00 after the e8 opcode is a placeholder for the call displacement, which the linker patches in later.

Unfortunately, plain objdump -d does not take this into account on x86. It disassembles the bytes as they stand, as if the displacement really were 00 00 00 00, i.e. as a call to the next instruction. That instruction lies 0x9 bytes after the nearest preceding label, so the target is printed as add_two_num+0x9.

You can pass the -r option to objdump to have it show you relocation information. This way you know what function is actually being called. It'll look something like this:

0000000000000000 <add_two_num>:
   0:   53                      push   %rbx
   1:   48 89 d3                mov    %rdx,%rbx
   4:   e8 00 00 00 00          call   9 <add_two_num+0x9>
            5: R_X86_64_PLT32   dummy-0x4
   9:   89 03                   mov    %eax,(%rbx)
   b:   5b                      pop    %rbx
   c:   c3                      ret    
Shipp answered 3/6, 2022 at 13:56

Note the code bytes on the call line; the immediate operand is all zeros. This is clearly a placeholder for the linker.

The add_two_num+0x9 comes from the fact that the immediate operand of call is the offset of the call destination relative to the end of the call instruction. A zero operand therefore means the call target is the instruction right after the call, which happens to be the mov at offset 0x9 from add_two_num (address 0x8 + 5 bytes = 0xd = 0x4 + 0x9 in your listing). The disassembler does its best to describe the call target symbolically, and it sees that the target is (technically) inside add_two_num.

Zealous answered 3/6, 2022 at 13:56
Comments (2):
Nitpick: They are called instructions, not commands. - Gauzy
Tomato, potato :) - Zealous

© 2022 - 2024 — McMap. All rights reserved.