Understanding the auipc+jalr sequence used for function calls

$ objdump -d test.o

test.o:     file format elf64-littleriscv

Disassembly of section .text:

0000000000000000 <id>:
   0:   00008067            ret

0000000000000004 <add_one>:
   4:   ff010113            addi    sp,sp,-16
   8:   00113423            sd  ra,8(sp)
   c:   00000317            auipc   t1,0x0
  10:   000300e7            jalr    t1
  14:   00813083            ld  ra,8(sp)
  18:   00150513            addi    a0,a0,1
  1c:   01010113            addi    sp,sp,16
  20:   00008067            ret

When disassembling an object file, the address information displayed for auipc/jalr is essentially a placeholder, because the linker relocates it anyway.

You can see that when also dumping the relocation information (add -r to your objdump call):

0000000000000000 <id>:
   0:   8082                    ret
0000000000000002 <add_one>:
   2:   1141                    addi    sp,sp,-16
   4:   e406                    sd  ra,8(sp)
   6:   00000097            auipc   ra,0x0
            6: R_RISCV_CALL id
            6: R_RISCV_RELAX    *ABS*
   a:   000080e7            jalr    ra # 6 <add_one+0x4>
   e:   60a2                    ld  ra,8(sp)
  10:   0505                    addi    a0,a0,1
  12:   0141                    addi    sp,sp,16
  14:   8082                    ret
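
The placeholder target in the jalr annotation (# 6 <add_one+0x4>) can be reproduced by hand. The following is a minimal Python sketch, not part of any toolchain; the function names are made up for illustration. It assumes the standard RV32I/RV64I encodings of auipc (U-type, opcode 0x17) and jalr (I-type, opcode 0x67) and ignores 64-bit wrap-around:

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def auipc_jalr_target(pc, auipc, jalr):
    """Return the address an auipc+jalr pair jumps to.

    pc is the address of the auipc instruction; auipc and jalr are the
    raw 32-bit instruction words (hypothetical helper, for illustration).
    """
    if auipc & 0x7F != 0x17 or jalr & 0x7F != 0x67:
        raise ValueError("expected an auipc followed by a jalr")
    hi = sign_extend(auipc & 0xFFFFF000, 32)  # upper 20 bits, already shifted
    lo = sign_extend(jalr >> 20, 12)          # 12-bit I-type immediate
    return (pc + hi + lo) & ~1                # jalr clears the lowest bit


# Unrelocated pair from the object file: both immediates are still zero,
# so the "target" is just the address of the auipc itself.
print(hex(auipc_jalr_target(0x6, 0x00000097, 0x000080E7)))  # 0x6
```

Because both immediates are zero until the linker fills them in, the displayed target carries no information about where the call will eventually go.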

The R_RISCV_CALL and R_RISCV_RELAX relocation entries tell the linker to relocate the call in a relaxed fashion (the default for the RISC-V toolchain). That means it is allowed to replace an auipc+jalr pair with a single jal instruction if the distance to the target address is short enough, i.e. within jal's range of roughly ±1 MiB. Such replacements are advantageous because they save instructions, so the resulting program is shorter. The price is a somewhat more complicated relocation procedure, because the offsets of all following jump instructions need to be adjusted accordingly.

(This can be disabled with the -mno-relax GCC flag.)
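
To make the relaxation condition concrete, here is a simplified Python sketch of the decision for a single call site. It is not how a real linker is structured: a real linker iterates, because every shortened call moves the code behind it, and it may pick other encodings as well. The function name try_relax_call is hypothetical, and pc is assumed to be the final address of the call site:

```python
JAL_OPCODE = 0x6F


def try_relax_call(pc, target, rd=1):
    """Return the 32-bit jal encoding for a call from pc to target,
    or None if the distance does not fit and auipc+jalr must stay.

    rd=1 is the ra register. Hypothetical helper, for illustration only.
    """
    offset = target - pc
    # jal has a signed 21-bit offset whose lowest bit is implicitly zero.
    if offset % 2 != 0 or not -(1 << 20) <= offset < (1 << 20):
        return None
    imm = offset & 0x1FFFFF
    return (
        ((imm >> 20) & 0x1) << 31      # off[20]
        | ((imm >> 1) & 0x3FF) << 21   # off[10:1]
        | ((imm >> 11) & 0x1) << 20    # off[11]
        | ((imm >> 12) & 0xFF) << 12   # off[19:12]
        | (rd & 0x1F) << 7
        | JAL_OPCODE
    )


print(hex(try_relax_call(0x10162, 0x1015C)))  # 0xffbff0ef
print(try_relax_call(0x10162, 0x10162 + (1 << 20)))  # None, out of range
```

The first call uses the addresses of add_one's call site and of id from the linked binary; the result is the jal instruction word the linker ends up emitting there.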

Why can't the assembler directly emit final auipc/jalr/jal instructions for symbols local to the translation unit that don't need to be relocated? After all, those jumps are pc-relative.

In general it can't, because with just the local view of one translation unit 1) a relaxed relocation to an external symbol may change all following offsets to internal symbols and 2) the linker might apply some more advanced rule, e.g. one where an internal symbol is overlaid by an external one, so that the reference really has to be relocated by the linker. Another example is the linker deleting a symbol.

If you want to look at relocated addresses/offsets you have to disassemble the linked binary, e.g.:

000000000001015c <id>:
   1015c:   8082                    ret
000000000001015e <add_one>:
   1015e:   1141                    addi    sp,sp,-16
   10160:   e406                    sd  ra,8(sp)
   10162:   ffbff0ef            jal ra,1015c <id>
   10166:   60a2                    ld  ra,8(sp)
   10168:   0505                    addi    a0,a0,1
   1016a:   0141                    addi    sp,sp,16
   1016c:   8082                    ret

As expected, the linker relaxes auipc+jalr to just jal. Unfortunately, objdump doesn't display the raw jal offset: 1015c is the absolute address obtained by adding the offset to the instruction's own address, 10162.¹

You can verify this by decoding the binary instruction in the second column yourself:

   0xffbff0ef
=  0b11111111101111111111000011101111 | split into the offset parts
=>   1 1111111101 1 11111111          | i.e. off[20], off[10:1], off[11], off[19:12]
                                      | merge them into off[20:1]
=> 0b11111111111111111101             | left-shift by 1
=> 0b111111111111111111010            | sign-extend
=> 0b11111111111111111111111111111010
=  -6
=> 0x10162 - 6
=  0x1015c

Which matches the objdump output.
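
The same manual decoding can be written as a few lines of Python. This is a minimal sketch that assumes the standard J-type immediate layout used by jal; the function names are chosen for illustration:

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_jal_offset(instr):
    """Extract the signed byte offset from a 32-bit jal instruction word."""
    if instr & 0x7F != 0x6F:
        raise ValueError("not a jal instruction")
    off_20 = (instr >> 31) & 0x1
    off_10_1 = (instr >> 21) & 0x3FF
    off_11 = (instr >> 20) & 0x1
    off_19_12 = (instr >> 12) & 0xFF
    # Reassemble the scattered fields; bit 0 of the offset is always zero.
    imm = (off_20 << 20) | (off_19_12 << 12) | (off_11 << 11) | (off_10_1 << 1)
    return sign_extend(imm, 21)


offset = decode_jal_offset(0xFFBFF0EF)
print(offset)                 # -6
print(hex(0x10162 + offset))  # 0x1015c
```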

¹ This applies to GNU binutils objdump. In contrast, llvm-objdump (official RISC-V support arrived with LLVM 9) does display the raw offset:

000000000001015e add_one:
   1015e: 41 11                         addi    sp, sp, -16
   10160: 06 e4                         sd  ra, 8(sp)
   10162: ef f0 bf ff                   jal -6
   10166: a2 60                         ld  ra, 8(sp)
   10168: 05 05                         addi    a0, a0, 1
   1016a: 41 01                         addi    sp, sp, 16
   1016c: 82 80                         ret

However, in contrast to GNU binutils objdump, llvm-objdump annotates neither the resulting absolute address nor the corresponding symbol here. Thus, the GNU binutils objdump output is arguably more useful in general.
