They all do the same thing, i.e. nothing.
As you say, logically, sign- or zero-extending a value to a width larger than the operand size should not actually affect the value used, and that's correct. You can confirm it with a careful reading of the pseudocode in the Architecture Reference Manual. In the code for ExtendReg, note the line len = Min(len, N - shift). Here N is 32, so it makes no difference whether len is 32 or 64.
Similarly, uxtx and sxtx are both no-ops for either 32-bit or 64-bit instructions.
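To make the effect of that Min concrete, here is a minimal Python sketch of ExtendReg as I read the pseudocode. The names `EXTEND_TYPES` and `extend_reg` are my own, and register values are modelled as plain Python integers, so treat it as an illustration rather than a transcription of the manual.

```python
# Hypothetical model of the ARM pseudocode function ExtendReg.
# exttype -> (is_unsigned, source width in bits)
EXTEND_TYPES = {
    "uxtb": (True, 8), "uxth": (True, 16), "uxtw": (True, 32), "uxtx": (True, 64),
    "sxtb": (False, 8), "sxth": (False, 16), "sxtw": (False, 32), "sxtx": (False, 64),
}

def extend_reg(value, exttype, shift, n):
    """Return the N-bit extended-and-shifted operand for register contents `value`."""
    assert 0 <= shift <= 4
    unsigned, length = EXTEND_TYPES[exttype]
    mask_n = (1 << n) - 1
    val = value & mask_n                   # X[reg, N]: only the low N bits are read
    length = min(length, n - shift)        # the line the answer points at
    field = val & ((1 << length) - 1)      # val<len-1:0>
    result = field << shift                # val<len-1:0> : Zeros(shift)
    total_bits = length + shift
    if not unsigned and (result >> (total_bits - 1)) & 1:
        result -= 1 << total_bits          # sign-extend from the top bit of the field
    return result & mask_n

if __name__ == "__main__":
    w2 = 0xDEADBEEF
    for ext in ("uxtw", "sxtw", "uxtx", "sxtx"):
        print(ext, hex(extend_reg(w2, ext, 3, 32)))
```

With N = 32 and a shift of 3, len is clamped to 29 for all four W/X extension types, so the signedness flag has nothing left to act on.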
So the following instructions all have exactly the same architectural effect, performing the operation w0 = w1 + (w2 << 3). I actually tested them with a selection of chosen and random inputs, verifying that the results and flags are identical for all five.
0: 2b224c20 adds w0, w1, w2, uxtw #3
4: 2b22cc20 adds w0, w1, w2, sxtw #3
8: 2b226c20 adds w0, w1, w2, uxtx #3
c: 2b22ec20 adds w0, w1, w2, sxtx #3
10: 2b020c20 adds w0, w1, w2, lsl #3
However, note that their encodings are different.
And that is also why they use different mnemonics for the extension operation: one of the principles of the ARM64 assembly language is that every legal binary encoding should have its own unambiguous assembly. So if for some obscure reason you care whether you get the encoding 0x2b224c20 or 0x2b226c20 -- say you are trying to write shellcode where certain bytes are forbidden -- you can specify uxtw or uxtx to select the one you want. This also means that if you disassemble and reassemble a section of code, you will get back the identical binary that you put in.
(Contrast the situation in x86 assembly language, where redundant encodings do not get distinct mnemonics. So add edx, ecx may assemble to either 01 ca (the "store form") or 03 d1 ("load form"), and assemblers often don't give you any way to pick which one. Likewise both encodings will disassemble to add edx, ecx, so if you disassemble and reassemble you may not end up with the same binary you started with. See How to resolve ambivalence in x64 assembly? and its duplicate links.)
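As a rough illustration of where those two x86 byte sequences come from, the sketch below builds the ModRM byte for a register-direct add both ways. The helper names (`modrm_reg_direct`, `add_store_form`, `add_load_form`) are invented for this example, and it only covers 32-bit register-to-register operands with no prefixes.

```python
# Standard x86 32-bit register numbers.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def modrm_reg_direct(reg_field, rm_field):
    """ModRM byte with mod=11 (register-direct): mod(2) | reg(3) | rm(3)."""
    return 0xC0 | (reg_field << 3) | rm_field

def add_store_form(dst, src):
    """Opcode 01 /r, ADD r/m32, r32: destination in r/m, source in reg."""
    return bytes([0x01, modrm_reg_direct(REG32[src], REG32[dst])])

def add_load_form(dst, src):
    """Opcode 03 /r, ADD r32, r/m32: destination in reg, source in r/m."""
    return bytes([0x03, modrm_reg_direct(REG32[dst], REG32[src])])

if __name__ == "__main__":
    print(add_store_form("edx", "ecx").hex(" "))
    print(add_load_form("edx", "ecx").hex(" "))
```

Both byte strings mean add edx, ecx; the only difference is which ModRM field holds the destination.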
The mnemonics for the extension operators reflect the encoding structure, which also helps to explain why the redundant encodings exist in the first place. The extension type is encoded in a 3-bit "option" field, bits 13-15 of the instruction. Bits 13-14 specify the width of the value to be extended:
00 = 8-bit byte B
01 = 16-bit halfword H
10 = 32-bit word W
11 = 64-bit doubleword X
Note that X is always effectively "no extension". Then bit 15 specifies the signedness: 0 = unsigned U, 1 = signed S. So 010 = uxtw and 011 = uxtx since that is what they logically specify, even though for a 32-bit operation, both have the same actual effect (i.e. none).
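The field layout just described can be sketched in a few lines of Python. The function names here are hypothetical, and the fixed bits in `encode_adds_ext_w` are my reading of the 32-bit ADDS (extended register) encoding, checked only against the four hex words in the listing above.

```python
WIDTH_SUFFIX = "bhwx"  # bits 13-14: 00=B, 01=H, 10=W, 11=X

def option_field(insn):
    """Extract the 3-bit option field (bits 13-15) from an instruction word."""
    return (insn >> 13) & 0b111

def option_mnemonic(option):
    """Map an option value to its operator name: bit 15 = sign, bits 13-14 = width."""
    sign = "s" if option & 0b100 else "u"
    return sign + "xt" + WIDTH_SUFFIX[option & 0b011]

def encode_adds_ext_w(rd, rn, rm, option, imm3):
    """32-bit ADDS (extended register): fixed bits | Rm | option | imm3 | Rn | Rd."""
    return 0x2B200000 | (rm << 16) | (option << 13) | (imm3 << 10) | (rn << 5) | rd

if __name__ == "__main__":
    for word in (0x2B224C20, 0x2B22CC20, 0x2B226C20, 0x2B22EC20):
        print(hex(word), option_mnemonic(option_field(word)))
```

Because the operator name is derived mechanically from the three bits, every option value gets a distinct spelling, whether or not it selects a distinct operation.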
This might seem like a waste of the instruction space, but presumably it allows the decoder hardware to be simpler than if the otherwise redundant encodings were to select some different operation.
The last option listed above, adds w0, w1, w2, lsl #3 has a different encoding altogether because it selects the "Add (shifted register)" opcode, instead of the "Add (extended register)" opcode as the first four do. So this is another redundancy; an add without extension, with a left shift of 0-4 bits, can be done with either opcode. However, this is not entirely useless, because the "extended register" form can use the stack pointer register sp as an operand, while the "shifted register" can use the zero register xzr/wzr. Both registers are encoded as "register 31", so each opcode has to specify whether it interprets "register 31" as the stack pointer or as the zero register. So the fact that the two opcodes have overlapping effect lets the instruction set provide addition using either the stack pointer or the zero register, where otherwise only one or the other could be supported.
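For comparison, here is a hedged sketch of the shifted-register form and of how the first source register (Rn) number 31 would be named under each opcode. The function names are hypothetical, the fixed bits are my reading of the 32-bit ADDS (shifted register) encoding, and `rn_name` covers only 32-bit adds.

```python
def encode_adds_shifted_w(rd, rn, rm, shift_type, imm6):
    """32-bit ADDS (shifted register): fixed bits | shift | Rm | imm6 | Rn | Rd.

    shift_type 0 is LSL; bit 21 is 0 here, whereas it is 1 in the
    extended-register form.
    """
    return 0x2B000000 | (shift_type << 22) | (rm << 16) | (imm6 << 10) | (rn << 5) | rd

def rn_name(form, n):
    """Name of first-source register number n for a 32-bit add of the given form."""
    if n != 31:
        return f"w{n}"
    # Register 31 is the stack pointer in the extended-register form
    # and the zero register in the shifted-register form.
    return "wsp" if form == "extended" else "wzr"

if __name__ == "__main__":
    print(hex(encode_adds_shifted_w(0, 1, 2, 0, 3)))
    print(rn_name("extended", 31), rn_name("shifted", 31))
```

The two forms overlap for ordinary registers with small left shifts, and differ only in what register 31 means.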
The sxt/uxt syntax shows up in a couple other places in the ARM64 assembly language, with slightly different details in each case.
The sxt*/uxt* instructions, which simply sign- or zero-extend one register into another. They are aliases for special cases of the sbfm/ubfm bitfield move instructions. sxtb, sxth, uxtb, uxth work with either a 32- or 64-bit destination, and sxtw x0, w1 with a 64-bit destination only.
The GNU assembler at least also supports uxtw w0, w1 and uxtw x0, w1, although the official Architecture Reference Manual does not document them. But they are both just aliases for mov w0, w1, since writes to 32-bit registers always zero the high half of the corresponding 64-bit register. (And a fun fact is that mov w0, w1 is itself an alias for orr w0, wzr, w1, a bitwise OR with the zero register.)
There are no mnemonics for the trivial uxtx, sxtx which would just be a 64-bit move. I suppose logically uxtx x0, x1 could be an alias of ubfm x0, x1, #0, #63, encoded as 0xd340fc20, but they didn't bother to support it. The uxtx operator to adds is needed because otherwise there would be no way to assemble 0x2b226c20, but since 0xd340fc20 can already be obtained with ubfm it doesn't need another redundant name. (Actually it seems ubfm x0, x1, #0, #63 disassembles as lsr x0, x1, #0, since the immediate shift instructions are also aliases for bitfield move.) Likewise, the useless sxtw w0, w1 is also rejected by the assembler.
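The alias relationship can be sketched by encoding the bitfield move directly. This is an illustration under my own reading of the SBFM/UBFM field layout; `encode_bitfield` and `extend_alias` are invented names, and an extend from a width of `width_bits` is modelled as a bitfield move with immr = 0 and imms = width_bits - 1.

```python
def encode_bitfield(signed, is64, immr, imms, rn, rd):
    """Encode SBFM (signed=True) or UBFM (signed=False).

    Layout assumed: sf | opc(2) | 100110 | N | immr(6) | imms(6) | Rn | Rd,
    with N equal to sf.
    """
    sf = 1 if is64 else 0
    opc = 0b00 if signed else 0b10
    return ((sf << 31) | (opc << 29) | (0b100110 << 23) | (sf << 22)
            | (immr << 16) | (imms << 10) | (rn << 5) | rd)

def extend_alias(signed, width_bits, is64, rn, rd):
    """Encoding of an sxt*/uxt*-style extend from width_bits, as a bitfield move."""
    return encode_bitfield(signed, is64, 0, width_bits - 1, rn, rd)

if __name__ == "__main__":
    # The hypothetical "uxtx x0, x1", i.e. ubfm x0, x1, #0, #63
    print(hex(extend_alias(False, 64, True, 1, 0)))
    # sxtw x0, w1, i.e. sbfm x0, x1, #0, #31
    print(hex(extend_alias(True, 32, True, 1, 0)))
```

The first call reproduces the 0xd340fc20 word mentioned above, which is why a separate uxtx mnemonic would add nothing that ubfm does not already reach.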
The extended-register addressing modes for the load, store, and prefetch instructions. They normally take 64-bit base and index registers ldr x0, [x1, x2], but the index can also be specified as a 32-bit register with either zero or sign extension: ldr x0, [x1, w2, uxtw] or ldr x0, [x1, w2, sxtw].
Here again a redundant encoding appears. These instructions contain a 3-bit "option" field with the same position and format as for add and friends, but here the byte and half-word versions are unsupported, so the encodings with bit 14 = 0 are undefined. Of the remaining four combinations, uxtw (010) and sxtw (110) make perfect sense. The other two use a 64-bit index with no extension, and so have the same effect as each other, but they need to be assigned distinct assembly syntax. The 011 encoding, which might logically be uxtx, is designated the "preferred" encoding and is written with no operator as ldr x0, [x1, x2], or ldr x0, [x1, x2, lsl #3] for the shifted version. The redundant 111 encoding is then selected with ldr x0, [x1, x2, sxtx] or ldr x0, [x1, x2, sxtx #3].
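A small Python sketch of that classification follows. The function names are hypothetical, and the base word 0xf8626820 is my assumption for the encoding of ldr x0, [x1, x2]; verify it with your own assembler before relying on it.

```python
# Option values that are defined for register-offset loads/stores,
# mapped to the operator used in assembly syntax.
LDST_OPTION_SYNTAX = {
    0b010: "uxtw",  # 32-bit index, zero-extended
    0b110: "sxtw",  # 32-bit index, sign-extended
    0b011: "lsl",   # 64-bit index, preferred encoding (operator may be omitted)
    0b111: "sxtx",  # 64-bit index, redundant encoding
}

def ldst_index_operator(insn):
    """Return the index operator for a register-offset load/store word, or None
    if bit 14 is clear (byte/halfword extension, undefined for these instructions)."""
    option = (insn >> 13) & 0b111
    return LDST_OPTION_SYNTAX.get(option)

def with_option(insn, option):
    """Return insn with its option field (bits 13-15) replaced."""
    return (insn & ~(0b111 << 13)) | (option << 13)

if __name__ == "__main__":
    base = 0xF8626820  # assumed: ldr x0, [x1, x2]
    for opt in range(8):
        word = with_option(base, opt)
        print(f"{opt:03b}", hex(word), ldst_index_operator(word))
```

Only four of the eight option values come back with an operator, and two of those four (011 and 111) describe the same 64-bit index.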
The uxtl/sxtl Extend Long SIMD instructions, which zero- or sign-extend the elements of a vector to double their original width. These are actually aliases for the ushll/sshll long shift instructions, with a shift count of 0. But otherwise there is nothing unusual about their encodings.
uxtx/sxtx as part of addressing modes (godbolt.org/z/4G5c6ProM) to allow compilers to avoid doing sign-extension when code uses an int as an array index with a 64-bit pointer, but wasn't aware of this usage. I assume it's the same as addressing-modes where sxtw #2 is sign-extend and left-shift by 2 (e.g. to index an int array, vs. just sxtx to not shift when indexing a char array). So perhaps for a 32-bit add, there are redundant ways to encode a left-shift, as sign- or zero-extending? Not posting an answer since I didn't check the manuals. – Minna
add instruction specifies a call to ExtendReg but no mention of what happens when ExtendReg returns a 64-bit value to the following 32-bit addition. So, should my decompiler truncate the 64-bit result blindly or not? – Selfexecuting
sxtw, not sxtx. (IDK what the difference is either, and would be interested to read an answer explaining the design of AArch64's sign/zero extension stuff.) – Minna
uxtx / sxtx / uxtw / sxtw all have the same effect, they allow you to select which of the four possible encodings you want, for the rare situations where it matters. – Corbitt
ExtendReg, for the case of adds w5, w19, w0, uxtx #3 it actually does return bits(32), so there's no type mismatch. The parameter N here is datasize which is 32, and N is what is passed to Extend. And in any case, because of the len = Min(len, N - shift) on the second-to-last line, you can see that whether len is initially set to 32 or 64 by UXTW/UXTX, the overall effect doesn't change. – Corbitt