I understand that in a typical ELF binary, functions get called through the Procedure Linkage Table (PLT). The PLT entry for a function usually contains a jump through a Global Offset Table (GOT) entry. Initially, this entry points to code that looks up the actual function address and stores it in the GOT; after the first call, the entry contains the actual function address (lazy binding).
To be precise, before lazy binding the GOT entry points back into the PLT, to the instructions following the jump into the GOT. These instructions will usually jump to the head of the PLT, from where some binding routine gets called which will then update the GOT entry.
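To make sure I describe the mechanism correctly, here is a minimal C sketch of how I picture it. It is only a simulation with function pointers, not what the linker actually emits, and all names (got_slot, lazy_stub, plt_entry, real_target, resolve_count) are hypothetical:

```c
/* Simulation of lazy binding in the current PLT scheme.
 * Every name here is hypothetical and only illustrates the control flow. */

typedef int (*fn_t)(int);

/* Counts how often the simulated binding routine runs. */
int resolve_count = 0;

/* Stand-in for the real library function (e.g. printf). */
int real_target(int x) { return x + 1; }

/* Plays the role of "push index; jmp <head of PLT>" plus the binder. */
int lazy_stub(int x);

/* One simulated GOT slot; before binding it points back at the stub. */
fn_t got_slot = lazy_stub;

int lazy_stub(int x) {
    resolve_count++;          /* the binding routine is invoked */
    got_slot = real_target;   /* it patches the GOT entry */
    return got_slot(x);       /* and forwards the original call */
}

/* Simulated PLT entry: nothing but an indirect jump through the GOT. */
int plt_entry(int x) { return got_slot(x); }
```

Calling plt_entry twice runs the stub only on the first call; afterwards got_slot holds the address of real_target and resolve_count stays at 1.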
Now I'm wondering why there are two indirections (calling into the PLT and then jumping to an address from the GOT), instead of just omitting the PLT and calling the address from the GOT directly. It looks like this could save a jump and the entire PLT. You would of course still need some code that calls the binding routine, but this could live outside the PLT.
Is there anything I am missing? What is/was the purpose of an extra PLT?
Update: As suggested in the comments, I created some (pseudo-) code ASCII art to further explain what I'm referring to:
This is the situation, as far as I understand it, in the current PLT scheme before lazy binding: (Some indirections between the PLT and printf are represented by "...".)
 Program                PLT                                printf
+---------------+      +------------------+                +-----+
| ...           |      | push [0x603008]  |<---+       +-->| ... |
| call j_printf |--+   | jmp [0x603010]   |----+--...--+   +-----+
| ...           |  |   | ...              |    |
+---------------+  +-->| jmp [printf@GOT] |-+  |
                       | push 0xf         |<+  |
                       | jmp 0x400da0     |----+
                       | ...              |
                       +------------------+
… and after lazy binding:
 Program                PLT                      printf
+---------------+      +------------------+      +-----+
| ...           |      | push [0x603008]  |  +-->| ... |
| call j_printf |--+   | jmp [0x603010]   |  |   +-----+
| ...           |  |   | ...              |  |
+---------------+  +-->| jmp [printf@GOT] |--+
                       | push 0xf         |
                       | jmp 0x400da0     |
                       | ...              |
                       +------------------+
In my imaginary alternative scheme without a PLT, the situation before lazy binding would look like this: (I kept the code in the "Lazy Binding Table" similar to the one from the PLT. It could also look different; I don't care.)
 Program                    Lazy Binding Table               printf
+-------------------+      +------------------+              +-----+
| ...               |      | push [0x603008]  |<-+       +-->| ... |
| call [printf@GOT] |--+   | jmp [0x603010]   |--+--...--+   +-----+
| ...               |  |   | ...              |  |
+-------------------+  +-->| push 0xf         |  |
                           | jmp 0x400da0     |--+
                           | ...              |
                           +------------------+
Now after the lazy binding, one wouldn't use the table anymore:
 Program                    Lazy Binding Table       printf
+-------------------+      +------------------+      +-----+
| ...               |      | push [0x603008]  |  +-->| ... |
| call [printf@GOT] |--+   | jmp [0x603010]   |  |   +-----+
| ...               |  |   | ...              |  |
+-------------------+  |   | push 0xf         |  |
                       |   | jmp 0x400da0     |  |
                       |   | ...              |  |
                       |   +------------------+  |
                       +-------------------------+
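The same idea as a rough C sketch, again only a simulation under my own assumptions: the call site calls through the GOT slot itself, and a separate stub per slot does the binding. The names (nb_got, nb_bind, nb_stub_inc, nb_stub_dbl, nb_caller, and so on) are made up for illustration:

```c
/* Simulation of the imagined PLT-less scheme: call sites call through
 * the GOT slot directly. All names are hypothetical. */

typedef int (*fn_t)(int);

enum { SLOT_INC, SLOT_DBL, SLOT_COUNT };

/* Counts how often the simulated binding routine runs. */
int nb_bind_count = 0;

/* Stand-ins for two real library functions. */
int nb_inc(int x) { return x + 1; }
int nb_dbl(int x) { return x * 2; }

/* What the binding routine would look up for each slot. */
static const fn_t nb_targets[SLOT_COUNT] = { nb_inc, nb_dbl };

/* Entries of the "Lazy Binding Table", one stub per slot. */
int nb_stub_inc(int x);
int nb_stub_dbl(int x);

/* Simulated GOT; before binding, each slot points at its own stub. */
fn_t nb_got[SLOT_COUNT] = { nb_stub_inc, nb_stub_dbl };

/* Binding routine: patch the slot, then forward the original call. */
static int nb_bind(int slot, int x) {
    nb_bind_count++;
    nb_got[slot] = nb_targets[slot];
    return nb_got[slot](x);
}

/* Each stub corresponds to "push <slot index>; jmp <binder>". */
int nb_stub_inc(int x) { return nb_bind(SLOT_INC, x); }
int nb_stub_dbl(int x) { return nb_bind(SLOT_DBL, x); }

/* Call site: the equivalent of "call [func@GOT]", with no PLT entry. */
int nb_caller(int x) { return nb_got[SLOT_INC](x) + nb_got[SLOT_DBL](x); }
```

The first nb_caller call goes through both stubs once; every later call goes straight from the call site to the target, and the stubs are never touched again.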
… call myfunc@GOTPCREL[rip] into call myfunc if it does find myfunc is available directly to be linked into the same library. (And IIRC it uses a segment override prefix to pad the call rel32 to fill the 6-byte slot). – Basilbasilar