MIPS processors use fixed-size instructions, where each instruction word is, well, a word (i.e. 4 bytes == 32 bits). So there's only so much information that can be crammed into those 4 bytes.
The J and JAL instructions use 6 of the 32 bits to specify the opcode. This leaves 26 bits to specify the target address. The target address isn't specified directly in the instruction though (there aren't enough bits for that) - instead, what happens is this:
- The low 28 bits of the target address are shifted right 2 bits, and the resulting 26 bits are stored in the instruction word. Since all instructions must be word-aligned, the two bits that we shifted out will always be zeroes, so we don't lose any information that we can't recreate.
- When the jump occurs, those 26 bits are shifted left 2 bits to get the original 28 bits, and then they are combined with the 4 most significant bits of the address of the instruction following the J/JAL to form a 32-bit address.
This makes it possible to jump to any instruction in the same 256MB range (2^28 bytes) that the jump instruction is located in (or, if delayed branching is enabled, to any instruction in the same 256MB range as the instruction in the delay slot).
For the branch instructions there are 16 bits available to specify the target address. These are stored as signed offsets relative to the instruction following the branch instruction (again with two bits of shifting applied, because it's unnecessary to store something that we know will always be 0). So the actual offset after restoring the 2 least significant bits is 18 bits, which is then sign-extended to 32 bits and added to the address of the instruction following the branch instruction. This makes it possible to branch to any instruction within roughly +/-128kB of the instruction following the branch.
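The same kind of sketch works for branches. Again, this is just the arithmetic described above under the assumption that the following instruction is at the branch's address plus 4; `encode_branch_offset` and `decode_branch_target` are hypothetical helper names, not part of any real toolchain.

```python
def encode_branch_offset(branch_addr, target):
    """Return the 16-bit field stored in a branch instruction word."""
    delta = target - (branch_addr + 4)   # relative to the following instruction
    if delta & 0x3:
        raise ValueError("target must be word-aligned")
    words = delta >> 2                   # drop the two always-zero bits
    if not -0x8000 <= words <= 0x7FFF:
        raise ValueError("target is out of branch range")
    return words & 0xFFFF                # two's-complement 16-bit field


def decode_branch_target(field, branch_addr):
    """Rebuild the 32-bit target from the 16-bit field."""
    offset = (field & 0xFFFF) << 2       # 18-bit byte offset
    if offset & 0x20000:                 # bit 17 set: the offset is negative
        offset -= 0x40000                # sign-extend
    return (branch_addr + 4 + offset) & 0xFFFFFFFF


# A branch to itself is an offset of -1 word from the following instruction.
field = encode_branch_offset(0x1000, 0x1000)
print(hex(field))
print(hex(decode_branch_target(field, 0x1000)))
```

The range check is where the +/-128kB limit comes from: a signed 16-bit word count covers -32768 to +32767 words, i.e. -131072 to +131068 bytes.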
Consider the following code loaded at address 0x00400024:
main:
j foo
nop
foo:
b main
nop
The j foo instruction is encoded as 0x0810000b. The 26 least significant bits have the value 0x10000b, which after shifting 2 bits to the left become 0x40002c. The 4 most significant bits of the address of the instruction following j are zero, so the target address becomes (0 << 28) | 0x40002c, which equals 0x40002c, which happens to be the address of foo.
The b main instruction is encoded as 0x0401fffd. The 16 least significant bits have the value 0xfffd, which after shifting 2 bits to the left becomes 0x3fff4. Sign-extending that to 32 bits gives us 0xfffffff4. And when adding that to the address of the instruction following the b we get 0x400030 + 0xfffffff4, which (when truncated to 32 bits) equals 0x400024, which happens to be the address of main.
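To double-check the two encodings, here is a small sketch that goes the other way: from the listing and its load address to the instruction words. It hard-codes the layout of this particular example (every instruction is 4 bytes, so foo is 8 bytes after main) and assumes the assembler emitted b as bgez $zero, which is what the 0x0401.... pattern corresponds to; other assemblers may pick a different encoding for b.

```python
BASE = 0x00400024                 # load address of the example
main = BASE                       # address of "j foo"
foo = BASE + 8                    # address of "b main" (after j foo and its nop)

# j foo: opcode 2 in the top 6 bits, target >> 2 in the low 26 bits.
j_word = (0x02 << 26) | ((foo & 0x0FFFFFFF) >> 2)

# b main as bgez $zero, main: opcode 1, rs = 0, rt = 1, 16-bit word offset
# relative to the instruction following the branch.
offset_words = (main - (foo + 4)) >> 2
b_word = (0x01 << 26) | (0x01 << 16) | (offset_words & 0xFFFF)

print(f"j foo  -> 0x{j_word:08x}")
print(f"b main -> 0x{b_word:08x}")
```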
If you want to jump to some arbitrary address, load the address into a register and use the jr or jalr instruction to jump.
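As a rough way to tell when that is needed, the sketch below checks whether a J/JAL at a given address could encode a target directly, and otherwise splits the address into the two 16-bit halves that a lui/ori pair would load into a register before the jr. Both function names are invented for this illustration, and the check assumes the following instruction is at the jump's address plus 4.

```python
def j_can_reach(jump_addr, target):
    """True if a J/JAL at jump_addr can encode target directly."""
    next_addr = (jump_addr + 4) & 0xFFFFFFFF
    same_region = (next_addr & 0xF0000000) == (target & 0xF0000000)
    return same_region and target % 4 == 0


def split_for_lui_ori(address):
    """Split a 32-bit address into the (upper, lower) 16-bit halves."""
    return (address >> 16) & 0xFFFF, address & 0xFFFF


target = 0x80001234
if not j_can_reach(0x00400024, target):
    upper, lower = split_for_lui_ori(target)
    print(f"lui $t0, 0x{upper:04x}")
    print(f"ori $t0, $t0, 0x{lower:04x}")
    print("jr  $t0")
```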