How can ARM's MOV instruction work with a large number as the second operand?
E

7

19

I have just begun studying ARM assembly language, and I am not clear on how to use MOV to load an immediate value into a register.

Both the ARM reference manual and my textbook say that the range of the immediate value following a MOV instruction is 0-255. But when I test it on my own PC in the ADS 1.2 IDE, the instruction

MOV     R2, #0xFFFFFFFF

assembles and runs fine. Isn't 0xFFFFFFFF out of range according to the specification?

Empyema answered 12/4, 2010 at 18:40 Comment(1)
related: #14047186Cambist
G
17

Remember that the ARM can perform a certain set of manipulations on the immediate value as part of the barrel shifter that is incorporated into the ARM's opcodes.

This little article has one of the clearest explanations of some of the tricks that an ARM assembler can use to fit a large immediate number into the small available space of an ARM instruction:

The article discusses the trick likely used in your specific example of generating a MVN opcode to load the bitwise complement of the immediate value.

These kinds of manipulation can't be done with all immediate values, but the ARM assemblers are supposedly pretty smart about it (and C compilers certainly are). If no shift/complement tricks can be performed, the value will generally be loaded from a PC-relative location or maybe by 'building up' the value from several instructions.
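As a rough illustration of the "building up" approach, here is a minimal Python sketch that splits a 32-bit constant into pieces, each of which is an 8-bit field starting at an even bit position and so fits the ARM-mode immediate form. The function name `split_into_chunks` is hypothetical, and the greedy strategy is an assumption for illustration, not what any particular assembler does. It ignores wrap-around rotations, so it can produce more pieces than strictly necessary.

```python
MASK32 = 0xFFFFFFFF

def split_into_chunks(value):
    """Split a 32-bit value into pieces that are each an 8-bit field
    starting at an even bit position (hypothetical helper)."""
    value &= MASK32
    chunks = []
    while value:
        lowest = (value & -value).bit_length() - 1  # index of the lowest set bit
        start = lowest & ~1                         # round down to an even position
        chunk = value & (0xFF << start) & MASK32    # take up to 8 bits from there
        chunks.append(chunk)
        value &= ~chunk                             # remove the bits just consumed
    return chunks

if __name__ == "__main__":
    pieces = split_into_chunks(0x12345678)
    # The first piece would go into a MOV, the rest into ORR (or ADD) instructions.
    for i, piece in enumerate(pieces):
        op = "MOV R1," if i == 0 else "ORR R1, R1,"
        print(f"{op} #0x{piece:X}")
```

Each extra piece costs one more instruction, which is the trade-off against a PC-relative load discussed in the comments.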

Gable answered 12/4, 2010 at 18:58 Comment(5)
@Michael Thanks for your tips. That's what I want to know! :-)Empyema
Does anyone know why the assembler builds up some values, but will load others directly? Looking around the Newton ROM code (StrongArm 110) has a lot of "one-instruction loads" (like "MOV r1, 0x0c1b518"), but all of my code comes out with "build-up loads" - like the following code:Athene
.. (oops, early posting mistake).. like "MOV r1, 0x0C000000 / ADD r1,r1,0x100000". I presume this might have something to do with exactly how the processor encodes 32-bit values. Is it more efficient for the processor microcode to build up numbers using a single MOV and then ADD's?Athene
@JimWitte It's a trade-off. If you do LDR Rx,=<val> to load a word PC-relative then if the value is in the cache it will be issued (if not available) in a single cycle. If you form a constant from multiple MOV/ADD ops then you spend two cycles to form the same constant.Galley
See here for more details.Goodale
B
14

A single ARM instruction can only encode an immediate constant that can be represented as an 8-bit value rotated right by an even number of bits (0, 2, 4, ..., 30) within the 32-bit word.

However, there is also a MVN instruction, which is like MOV but inverts all the bits. So, while MOV R2, #0xFFFFFFFF cannot be encoded as a MOV instruction, it can be encoded as MVN R2, #0. The assembler may well perform this conversion for you.
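To make the rule concrete, here is a small Python sketch that checks whether a constant fits the 8-bit-rotated form, and falls back to the complement for MVN. The function names (`arm_immediate`, `pick_mov_or_mvn`) are made up for illustration. This models only the classic ARM-mode encoding, not whatever a given assembler actually implements.

```python
MASK32 = 0xFFFFFFFF

def rotl32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32

def arm_immediate(value):
    """Return (imm8, rotate_right) if `value` is an 8-bit value rotated
    right by an even amount, else None (hypothetical helper)."""
    value &= MASK32
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        imm8 = rotl32(value, rot)
        if imm8 <= 0xFF:
            return imm8, rot
    return None

def pick_mov_or_mvn(value):
    """Return ("MOV", operand), ("MVN", operand), or None if neither fits."""
    value &= MASK32
    if arm_immediate(value) is not None:
        return "MOV", value
    inverted = ~value & MASK32
    if arm_immediate(inverted) is not None:
        return "MVN", inverted
    return None

if __name__ == "__main__":
    print(arm_immediate(0xFFFFFFFF))    # not encodable directly
    print(pick_mov_or_mvn(0xFFFFFFFF))  # encodable through the complement
```

For 0xFFFFFFFF the direct check fails, but the complement is 0, which fits trivially, so the sketch picks MVN with an operand of 0.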

Boggs answered 12/4, 2010 at 18:58 Comment(0)
R
3

The MOV instruction accepts either an imm16 value or an Operand2 value (the limit comes from the instruction length, as opposed to memory alignment). An Operand2 constant must conform to one of the following rules (copied from the Cortex-M instruction set manual; X and Y are any hex digits):

  • Any constant that can be produced by shifting an 8-bit value left by any number of bits within a 32-bit word.
  • Any constant of the form 0x00XY00XY.
  • Any constant of the form 0xXY00XY00.
  • Any constant of the form 0xXYXYXYXY.

This is why 0xFFFFFFFF is accepted (it conforms to the fourth rule).
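The four rules above can be sketched as a Python check. The function name `is_thumb2_modified_immediate` is hypothetical, and the code follows the rules only as they are worded in this answer, so treat it as an illustration rather than a reference encoder.

```python
def is_thumb2_modified_immediate(value):
    """Check a 32-bit constant against the four rules quoted above
    (hypothetical helper, follows the wording of the rules only)."""
    value &= 0xFFFFFFFF

    # Rule 1: an 8-bit value shifted left by any amount within the word.
    for shift in range(25):
        if value & ~(0xFF << shift) == 0:
            return True

    low = value & 0xFF          # the "XY" byte for rules 2 and 4
    high = (value >> 8) & 0xFF  # the "XY" byte for rule 3

    if value == low * 0x00010001:   # Rule 2: 0x00XY00XY
        return True
    if value == high * 0x01000100:  # Rule 3: 0xXY00XY00
        return True
    if value == low * 0x01010101:   # Rule 4: 0xXYXYXYXY
        return True
    return False

if __name__ == "__main__":
    for constant in (0xFFFFFFFF, 0x45454545, 0x00450045, 0x12345678):
        print(f"0x{constant:08X}", is_thumb2_modified_immediate(constant))
```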

If you want to build your own 32-bit constant, you can use the MOVT instruction, which writes into the upper half of a register.
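As a sketch of the arithmetic, the following Python snippet splits a 32-bit constant into the two 16-bit halves. It assumes the usual pairing in which a 16-bit MOV (written MOVW here) sets the lower half and MOVT then fills the upper half; MOVW and the helper names are not from the answer above, and the pair is only available on cores that support these encodings.

```python
def split_for_movw_movt(value):
    """Return (low16, high16) for a 32-bit constant (hypothetical helper)."""
    value &= 0xFFFFFFFF
    low16 = value & 0xFFFF   # goes into the 16-bit MOV (MOVW)
    high16 = value >> 16     # goes into MOVT
    return low16, high16

def movw_movt_listing(reg, value):
    """Format the two-instruction sequence as text, for illustration only."""
    low16, high16 = split_for_movw_movt(value)
    return [f"MOVW {reg}, #0x{low16:04X}", f"MOVT {reg}, #0x{high16:04X}"]

if __name__ == "__main__":
    for line in movw_movt_listing("R0", 0x12345678):
        print(line)
```

The order matters under this assumption: the 16-bit MOV clears the upper half, so MOVT has to come second.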

Respirable answered 5/11, 2014 at 17:0 Comment(2)
Then why wouldn't it allow 0x45454545? Actually 0x00450045 even too.Eve
@JSmyth: did you tell the assembler it could use ARMv7 encodings? GAS defaults to assuming backwards compat, and won't use things that only work on more recent CPUs unless you tell it to.Perlis
B
2

It's somewhat hard to determine if the given constants are within the valid range.

As Matthew already mentioned, the assembler lends you a hand by replacing a given instruction with its complementing or negating counterpart, such as mov/mvn, cmp/cmn, add/sub, and/bic, etc.

Babita answered 7/11, 2011 at 4:59 Comment(0)
C
1

You may be seeing artifacts from sign-extension of the original value. If the tools you're using to view the disassembly handle the 0..255 value as a signed byte, then loading it into a larger int type (or register) fills all the upper bits with the sign bit of the original value. To put it another way, if 0xFF is a signed byte, its decimal value is -1. Put that into a 32-bit register and the hex will look like 0xFFFFFFFF, and its decimal value is still -1.

Try using a value without the high bit set, such as 0x7F. Since the sign bit is not set, I'm guessing it will fill the upper bits with zero when loaded into a larger int type register or field.
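The sign-extension arithmetic described here can be checked with a few lines of Python. The helper names are made up for illustration, and this shows only the arithmetic, not what any particular disassembler does.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit

def as_u32(value):
    """Show a (possibly negative) integer as its 32-bit pattern."""
    return value & 0xFFFFFFFF

if __name__ == "__main__":
    for byte in (0xFF, 0x7F):
        extended = sign_extend(byte, 8)
        print(f"0x{byte:02X} -> {extended} -> 0x{as_u32(extended):08X}")
```

With the high bit set (0xFF), the upper bits fill with ones; with it clear (0x7F), they fill with zeros.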

It's also possible that the compiler/assembler truncates whatever value you provide. I'd consider it a source code error, but assemblers are funny beasts. If you give it 0x7FF, does it compile to 0x7FF (not truncated, and larger than 0..255) or to 0xFFFFFFFF (truncated to 0..255, signed byte)?

Charters answered 12/4, 2010 at 18:48 Comment(0)
W
-1

If you want to move 0xffffffff to a register you can always do:

MOV R0, #-1

because 0xffffffff is the twos-complement representation of -1

Warton answered 31/5, 2019 at 13:24 Comment(2)
Yes, those are the same number for 32-bit 2's complement. But the question is how the assembler can encode that in machine code; it's the exact same problem regardless of how it's represented in the source. If one assembled ok but the other didn't, that would be a bug in your assembler: -1 and 0xffffffff are the same bit-pattern, and the assembler should find a way to encode any bit pattern for which a way exists.Perlis
But anyway, I don't think ARM has a sign-extended immediate encoding for mov because mvn makes that redundant. You can choose what you want the bits outside of the rotated-immediate to be by choosing mov or mvn. That binary choice applies to all the other bits.Perlis
S
-3

One possibility is that the ARM assembler throws away the most significant bits of the number and uses only the lowest byte, 0xFF.

The MOV instruction is a staple in many CPU instruction sets, and usually the assembler resolves the size of the target register and the immediate value being supplied.

For example, the following are MOV instructions from the x86 set:

MOV BL, 80h          ; 8-bit
MOV BX, 0ACACh       ; 16-bit
MOV EBX, 12123434h   ; 32-bit
Sandisandidge answered 12/4, 2010 at 18:47 Comment(0)
