Why do x86 jump/call instructions use relative displacements instead of absolute destinations?

I am learning 8086 and there is one particular question which is bothering me and I have not been able to find any satisfactory answer yet.

I understand that the CPU executes code sequentially, and that if we want to change the code flow, we need the IP to point to the new/old address where the code of interest is sitting.

Now, my question is: why don't we (I mean the CPU) just update the IP with the address corresponding to the label when we encounter a jump instruction?

What is the need to have a displacement which is added to IP when we encounter jump instruction?

In my opinion

  1. calculating the displacement (i.e. the distance from the jump label to the next instruction after the jump),
  2. then taking that displacement's 2's complement,
  3. which finally gets added to the IP so that IP points to the address/instruction pointed to by the label

To me this sounds like more work than just updating the IP with the address corresponding to the label. But I am sure there must be a reason for the way things are done; it's just that I am not aware of it.

What was the reason for this design choice in 8086?

Rumple answered 12/9, 2017 at 20:22 Comment(9)
Displacements allow for the code to be relocated, and in some cases the encoding for the displacement can be shorter than had it been absolute.Leonoreleonsis
@MichaelPetch If we had the absolute address (the address pointed to by the label) instead of a displacement, wouldn't the code still be relocatable? I think the assembler can easily figure out the address pointed to by the label in the first pass, and if so, why do we want to get into encoding and things? Why don't we just use the address and be done with it?Rumple
Code compactness mattered a great deal back in 1976; memory was very expensive back then. That is one reason why all conditional jumps take only 1 byte for the jump offset, which necessarily has to be a relative offset. It still matters today, not for size but for speed: having to encode 8 bytes for a 64-bit address would have been very rough on the processor caches. Keeping code easily relocatable matters a great deal today as well; Unix in particular likes PIC (position-independent code) for shared libraries. Having to relocate prevents code sharing.Donte
No, Albert, using the absolute address instead of displacement is exactly the opposite of relocatable. Relocatable (in the sense Michael used it) means that the code can be moved to a different location in memory without changing the code. If the absolute address is in the instruction, this doesn't work, but if a displacement from the current IP is in the instruction, the jump still goes to the right place even if the code has been moved..Hum
@Hum Thanks for explaining relocatable, but I still have doubts. Take the following assembly, for example: label: mov a,0xffff .... jmp label. The assembler will replace label with the address of "mov a,0xffff". My question is: when we have this address, why calculate a displacement/offset? Also, doesn't having the label make the code relocatable?Rumple
@Rumple - Listen to Hans! In your 8086 the data bus is only 16 bits wide. Having the jump and the offset fit in 16 bits means that the instruction (properly aligned) can be read in one memory cycle. Going up to 3 bytes would take 2 memory cycles just to get the entire instruction loaded.Justness
Having the label makes the code relocatable at the time it is assembled/linked. Using the displacement in the instruction allows the executable code to be repositioned in memory without having to rebuild it.Hum
The CPU sees infiniteLoop: jmp infiniteLoop only as the machine code eb fe; it doesn't search for the label infiniteLoop and compute that it is -2 bytes away during each execution. That's the work of the assembler, which produces the machine code. So the CPU just does ip = ip + sign-extended(immediate), almost the same amount of work as ip = absolute_address. Even back in 197x the addition was a reasonably cheap operation; fetching the new opcodes from memory took longer than that. With modern x86 the addition is almost free, but keeping all that cache machinery up to date makes jmp complex.Footton
And that 0xFE is always -2, wherever you relocate that piece of code, while an absolute address encoded in the instruction would need patching with each relocation of the code to point to the correct absolute address. Modern executables don't know the address where they will be loaded by the OS, so they have a relocation table: the OS loads the binary from disk into memory, then goes through the relocation table and patches all the affected instructions to have correct absolute addresses. A PIC variant of an executable uses only relative addressing, so the OS can just load it at a random address and execute it.Footton

You are vastly over-estimating the cost in CPU complexity of decoding a relative jump.

  1. calculating the displacement (i.e. the distance from the jump label to the next instruction after the jump)
  2. then taking that displacement's 2's complement,

The machine code has to contain the result of step 2 (a signed integer relative displacement), so all of that is done at assemble time. And in the assembler, subtracting two integer addresses already gives you the signed 2's complement displacement you need.
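As a rough illustration of that subtraction, here is a minimal Python sketch of what an assembler might do for a short jump. The function names are made up for this example; the only facts taken from above are the EB opcode, the 2-byte instruction length, and that the displacement is measured from the next instruction.

```python
JMP_SHORT_LEN = 2  # EB opcode byte + one displacement byte


def rel8_displacement(jump_addr: int, target_addr: int) -> int:
    """Return the byte stored after the EB opcode (hypothetical helper).

    jump_addr is the address of the EB opcode itself, so the displacement
    is measured from jump_addr + 2, the address of the next instruction.
    """
    disp = target_addr - (jump_addr + JMP_SHORT_LEN)  # one subtraction
    if not -128 <= disp <= 127:
        raise ValueError("target out of range for a rel8 jump")
    return disp & 0xFF  # the low 8 bits are already the 2's complement form


def encode_jmp_short(jump_addr: int, target_addr: int) -> bytes:
    """Assemble `jmp short target` located at jump_addr."""
    return bytes([0xEB, rel8_displacement(jump_addr, target_addr)])


# infiniteLoop: jmp infiniteLoop, assembled at an arbitrary address
print(encode_jmp_short(0x0100, 0x0100).hex(" "))  # eb fe
```

There is no separate "take the 2's complement" step: masking the signed difference to 8 bits is all it takes.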

There are real advantages to using relative displacements, so making the ISA worse just to simplify writing an assembler would not have made any sense. You only need to write the assembler once, but everything that runs on the machine benefits from more compact code, and position independence.

Relative branch displacements are completely normal, and used in most other architectures, too (e.g. ARM: https://community.arm.com/processors/b/blog/posts/branch-and-call-sequences-explained, where fixed-width instructions make a direct absolute branch encoding impossible anyway). It would have made 8086 the odd one out to not use relative branch encoding.

update: Maybe not totally the odd one out. MIPS uses rel16 << 2 for beq / bne (MIPS instructions are fixed at 32 bits wide and always aligned). But for unconditional j (jump) instructions, interestingly, it uses a pseudo-direct encoding. It keeps the high 4 bits of PC and directly replaces the PC[27:2] bits with the value encoded in the instruction. (Again, the low 2 bits of the program counter are always 0.) So within the same 1/16th of the address space, j instructions are direct jumps and don't give you position-independent code. This also applies to jal (jump-and-link = call), making function calls from PIC code less efficient :( Linux-MIPS used to require PIC binaries, but apparently now it doesn't (though shared libs still have to be PIC).
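To make the difference between the two MIPS encodings concrete, here is a small Python sketch of how the targets could be computed. The function names are invented for this example, and it assumes the classic MIPS32 rule that both forms are based on PC + 4 (the address of the delay-slot instruction).

```python
def mips_j_target(pc: int, instr_index: int) -> int:
    """Target of a `j` located at address pc (pseudo-direct).

    instr_index is the 26-bit field from the instruction word.
    """
    upper = (pc + 4) & 0xF0000000            # keep the high 4 bits
    return upper | ((instr_index & 0x03FFFFFF) << 2)  # replace bits [27:2]


def mips_branch_target(pc: int, imm16: int) -> int:
    """Target of a `beq` / `bne` located at address pc (PC-relative)."""
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16  # sign-extend
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF


# The same j encoding lands on the same address from anywhere in the region,
# while the same branch encoding moves along with the code.
print(hex(mips_j_target(0x00400000, 0x100040)))    # 0x400100
print(hex(mips_j_target(0x00400800, 0x100040)))    # 0x400100
print(hex(mips_branch_target(0x00400000, 0x0003)))  # 0x400010
print(hex(mips_branch_target(0x00400800, 0x0003)))  # 0x400810
```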


When the CPU runs eb fe, all it has to do is add the displacement to IP instead of replacing IP. Since non-jump instructions already update IP by adding the instruction length, the adder hardware already exists.

Note that sign-extending 8-bit displacements to 16-bit (or 32 or 64-bit) is trivial in hardware: 2's complement sign-extension is just copying the sign bit, which doesn't require any logic gates, just wires to connect one bit to the rest. (e.g. 0xfe becomes 0xfffe, while 0x05 becomes 0x0005.)
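A software model of that step might look like the following Python sketch. It is only an illustration of the arithmetic, not a description of the real 8086 circuitry, and the function names are made up.

```python
def sign_extend_8_to_16(byte: int) -> int:
    """Copy bit 7 into bits 8..15, which in hardware is just wiring."""
    byte &= 0xFF
    return (byte | 0xFF00) if (byte & 0x80) else byte


def ip_after_jmp_short(ip_of_jmp: int, disp8: int) -> int:
    """IP after executing `EB disp8` located at ip_of_jmp."""
    ip_after_fetch = (ip_of_jmp + 2) & 0xFFFF  # IP already points past the jump
    # 16-bit wraparound makes adding 0xfffe behave like subtracting 2
    return (ip_after_fetch + sign_extend_8_to_16(disp8)) & 0xFFFF


print(hex(sign_extend_8_to_16(0xFE)))         # 0xfffe
print(hex(sign_extend_8_to_16(0x05)))         # 0x5
print(hex(ip_after_jmp_short(0x0100, 0xFE)))  # 0x100, i.e. eb fe jumps to itself
```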


8086 put a big emphasis on code density, providing short forms of many common instructions. This makes sense, because code-fetch was one of the most important bottlenecks on 8086, so smaller code usually was faster code.

For example, two forms of relative jmp existed, one with rel8 (short) and one with rel16 (near). (In the 32 and 64-bit modes introduced in later CPUs, the E9 opcode is jmp rel32 instead of rel16, but EB is still jmp rel8, because jumps within a function are often within -128/+127 bytes.)
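How an assembler might pick between the two forms can be sketched in Python as below. This assumes 16-bit mode (so E9 takes a rel16) and a target address that is already known; real assemblers also have to handle forward references, which this hypothetical encode_jmp ignores.

```python
def encode_jmp(jump_addr: int, target_addr: int) -> bytes:
    """Pick the shortest relative jmp encoding (16-bit mode, sketch only)."""
    disp8 = target_addr - (jump_addr + 2)       # EB rel8 is 2 bytes long
    if -128 <= disp8 <= 127:
        return bytes([0xEB, disp8 & 0xFF])
    disp16 = (target_addr - (jump_addr + 3)) & 0xFFFF  # E9 rel16 is 3 bytes long
    return bytes([0xE9]) + disp16.to_bytes(2, "little")


print(encode_jmp(0x0100, 0x0181).hex(" "))  # eb 7f     (+127 still fits in rel8)
print(encode_jmp(0x0100, 0x0182).hex(" "))  # e9 7f 00  (one byte further needs rel16)
print(encode_jmp(0x0300, 0x0100).hex(" "))  # e9 fd fd  (backward near jump)
```

The short form saves one byte per jump, which adds up when most jumps stay inside a single function.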

But there's no special short form for call, because it wouldn't be much use most of the time. So why does call still bother with a relative displacement instead of absolute?

Well, x86 does have absolute jumps, but only as indirect jumps or far jumps (to a different code segment). For example, the EA opcode is jmp ptr16:16: "Jump far, absolute, address given in operand".

To do an absolute near jump, simply use mov ax, target_label / jmp ax. (Or in MASM syntax, mov ax, OFFSET target_label.)


Relative displacements are position-independent

Comments on the question brought this up.

Consider a block of machine code (already assembled), with some jumps inside the block. If you copy that whole block to a different start address (or change the CS base address so the same block is accessible at a different offset within the segment), then only relative jumps will keep working.

For labels + absolute addresses to solve the same problem, the code would have to be re-assembled with a different ORG directive. Obviously that can't happen on the fly when you change CS with a far jmp!
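The effect of moving a block can be modelled with a few lines of Python. The 4-byte block is a made-up example (nop / nop / jmp short back to the start), and the "absolute" variant is purely hypothetical, standing in for an encoding that stores the address the label had when the code was assembled with ORG 0x0100.

```python
# start: nop / nop / jmp short start, already assembled.
# EB FC: the jump sits at offset 2, ends at offset 4, and -4 (0xFC) leads back to offset 0.
BLOCK = bytes([0x90, 0x90, 0xEB, 0xFC])
JMP_OFFSET = 2

# Hypothetical absolute operand: the address of `start` at assemble time.
ABSOLUTE_OPERAND = 0x0100


def relative_target(load_addr: int) -> int:
    """Where the EB jump goes when the block is loaded at load_addr."""
    disp = BLOCK[JMP_OFFSET + 1]
    if disp & 0x80:
        disp -= 0x100  # sign-extend the 8-bit displacement
    return (load_addr + JMP_OFFSET + 2 + disp) & 0xFFFF


def absolute_target(load_addr: int) -> int:
    """Where an absolute jump would go: it ignores the load address."""
    return ABSOLUTE_OPERAND


for load_addr in (0x0100, 0x8000):
    print(hex(load_addr), hex(relative_target(load_addr)), hex(absolute_target(load_addr)))
# 0x100 0x100 0x100
# 0x8000 0x8000 0x100
```

After the copy to 0x8000, the relative jump still reaches the start of the block, while the absolute operand still points at the old location.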

Overhand answered 14/9, 2017 at 6:32 Comment(3)
I was thinking that assembler does the conversion to machine code and based on that I was thinking that it is a lot to ask from assembler. Am I correct in my understanding?Rumple
@Albert: No, it's not a lot to ask from the assembler. Given the current and destination addresses as binary integers, subtracting them gives you the 2's complement displacement. (If the assembler is itself written in x86 asm / machine code, that's one extra sub instruction. Then extra instructions to check if it can use the compact rel8 encoding...) It's a very minor amount of extra software complexity, and not something you'd make the hardware worse for! Relative displacements in branches are completely normal for machine code in most architectures.Overhand
(I think 8-bit micros more often used 16-bit absolute addresses, which made some sense, using the same space as 8086's rel16. But for 8086 with segmentation, it's nice that jumps can be position-independent and work regardless of how CS:IP is addressing the current byte, as long as IP doesn't wrap.)Overhand
