When disassembling an object file, the address information displayed for auipc/jalr is essentially a placeholder, because the linker relocates it anyway.
You can see that when you also dump the relocation information (add -r to your objdump call):
0000000000000000 <id>:
0: 8082 ret
0000000000000002 <add_one>:
2: 1141 addi sp,sp,-16
4: e406 sd ra,8(sp)
6: 00000097 auipc ra,0x0
6: R_RISCV_CALL id
6: R_RISCV_RELAX *ABS*
a: 000080e7 jalr ra # 6 <add_one+0x4>
e: 60a2 ld ra,8(sp)
10: 0505 addi a0,a0,1
12: 0141 addi sp,sp,16
14: 8082 ret
Those relocation entries tell the linker to relocate the call in a relaxed fashion (the default for the RISC-V toolchain). That means it's allowed to replace an auipc+jalr pair with just one jal instruction whenever the distance to the target address is short enough for jal's ±1 MiB range. Such replacements are advantageous because they save an instruction, i.e. the resulting program is shorter. However, they complicate the relocation procedure a bit, because the offsets of subsequent jump instructions need to be adjusted accordingly.
(Relaxation can be disabled with the -mno-relax GCC flag.)
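As a rough illustration of the range check behind that relaxation, here is a minimal Python sketch. The function names are made up for this example, and a real linker takes more into account (alignment, other relaxations, addresses shifting while it relaxes), so treat it as a model of the rule only:

```python
# jal encodes a signed 21-bit byte offset whose lowest bit is always zero,
# so it reaches from -2**20 up to 2**20 - 2 bytes relative to its own pc.
JAL_MIN = -(1 << 20)
JAL_MAX = (1 << 20) - 2


def can_relax_to_jal(pc, target):
    """Return True if a call at pc could reach target with a single jal."""
    offset = target - pc
    return offset % 2 == 0 and JAL_MIN <= offset <= JAL_MAX


def call_size(pc, target):
    """Size in bytes of the call sequence in this simplified model."""
    return 4 if can_relax_to_jal(pc, target) else 8


# The call from add_one (at 0x10162 in the linked binary) to id (0x1015c)
print(can_relax_to_jal(0x10162, 0x1015C))  # True
print(call_size(0x10162, 0x1015C))         # 4
```

With the addresses from the linked example, the target is only 6 bytes away, so the single-instruction form is possible.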
Why can't the assembler directly emit final auipc/jalr/jal instructions for symbols local to the translation unit that don't need to be relocated? After all, those jumps are pc-relative.
In general it can't, because with just the local view of one translation unit 1) a relaxed relocation to an external symbol may change all following offsets to internal symbols, and 2) the linker might even apply some more advanced rule, e.g. one where an internal symbol is overlaid by an external one, such that the reference really has to be relocated by the linker. Another example is the linker deleting a symbol.
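Point 1) can be made concrete with a toy layout model. The instruction sequence below is hypothetical (it is not taken from the example above): a local jump, then a call to an external symbol, then one more instruction, then the local jump target.

```python
def layout(sizes):
    """Return the start address of each item, given the items' byte sizes."""
    addrs, pc = [], 0
    for size in sizes:
        addrs.append(pc)
        pc += size
    return addrs


# Indices into the hypothetical sequence:
# local jump, call to external symbol, some other instruction, local target
JUMP, CALL, OTHER, TARGET = range(4)

unrelaxed = layout([4, 8, 4, 4])  # external call as auipc+jalr (8 bytes)
relaxed = layout([4, 4, 4, 4])    # external call relaxed to jal (4 bytes)

offset_before = unrelaxed[TARGET] - unrelaxed[JUMP]
offset_after = relaxed[TARGET] - relaxed[JUMP]
print(offset_before, offset_after)  # 16 12
```

The local jump never refers to the external symbol, yet its pc-relative offset shrinks by 4 bytes once the call between it and its target is relaxed. The assembler cannot know whether that will happen, so it leaves the decision to the linker.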
If you want to look at relocated addresses/offsets you have to disassemble the linked binary, e.g.:
000000000001015c <id>:
1015c: 8082 ret
000000000001015e <add_one>:
1015e: 1141 addi sp,sp,-16
10160: e406 sd ra,8(sp)
10162: ffbff0ef jal ra,1015c <id>
10166: 60a2 ld ra,8(sp)
10168: 0505 addi a0,a0,1
1016a: 0141 addi sp,sp,16
1016c: 8082 ret
As expected, the linker relaxes auipc+jalr to just jal. Unfortunately, objdump doesn't display the raw jal offset: 1015c is the absolute address after adding the offset to 10162. [1]
You can verify this by decoding the instruction word in the second column yourself:
0xffbff0ef
= 0b11111111101111111111000011101111 | split into the offset parts
=> 1 1111111101 1 11111111 | i.e. off[20], off[10:1], off[11], off[19:12]
| merge them into off[20:1]
=> 0b11111111111111111101 | left-shift by 1
=> 0b111111111111111111010 | sign-extend
=> 0b11111111111111111111111111111010
= -6
=> 0x10162 - 6
= 0x1015c
Which matches the objdump output.
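The same decoding can be scripted. The following is a small Python sketch of the manual steps; the function name decode_jal_offset is just chosen for this example:

```python
def decode_jal_offset(insn):
    """Extract the sign-extended byte offset from a 32-bit RISC-V jal word."""
    if insn & 0x7F != 0x6F:
        raise ValueError("not a jal instruction")
    imm20 = (insn >> 31) & 0x1       # off[20], the sign bit
    imm10_1 = (insn >> 21) & 0x3FF   # off[10:1]
    imm11 = (insn >> 20) & 0x1       # off[11]
    imm19_12 = (insn >> 12) & 0xFF   # off[19:12]
    # Reassemble the parts at their proper bit positions; off[0] is always 0
    offset = (imm20 << 20) | (imm19_12 << 12) | (imm11 << 11) | (imm10_1 << 1)
    if imm20:
        offset -= 1 << 21            # sign-extend from 21 bits
    return offset


insn = 0xFFBFF0EF  # instruction word from the objdump output
pc = 0x10162       # address of the jal
offset = decode_jal_offset(insn)
print(offset, hex(pc + offset))  # -6 0x1015c
```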
[1] That is, GNU binutils objdump doesn't display the raw jal offset. In contrast, llvm-objdump (LLVM 9 introduces official RISC-V support) does display the raw offset:
000000000001015e add_one:
1015e: 41 11 addi sp, sp, -16
10160: 06 e4 sd ra, 8(sp)
10162: ef f0 bf ff jal -6
10166: a2 60 ld ra, 8(sp)
10168: 05 05 addi a0, a0, 1
1016a: 41 01 addi sp, sp, 16
1016c: 82 80 ret
However, in contrast to GNU binutils objdump, llvm-objdump doesn't include the resulting absolute address as an annotation, nor does it annotate the corresponding symbol. Thus, the GNU binutils objdump output is arguably more useful in general.
… gcc -S. – Foregut
… objdump -d); others will be resolved only at load/run time (use gdb with the start and disassemble function_name commands). Relocations are hidden from the default disassembler view in objdump; use objdump -drR (--reloc and --dynamic-reloc options) to see them. (You may also check the compiler's asm output test.s with gcc -O2 -fno-inline -S test.c to see how the compiler passes instructions on to the linker and loader.) – Choplogic