Understanding the ADR instruction in ARM, and adding an offset to that

I was looking at the assembler output of my code and need help with the instructions below.

    0x00000fe8:    e28fc000    ....    ADR      r12,{pc}+8 ; 0xff0

    0x00000fec:    e28cca08    ....    ADD      r12,r12,#8, 20 ; #0x8000

From my understanding, the 1st instruction causes r12 to be loaded with {pc value} + 8, that is,
"{address of the instruction currently executing (0xfe8) plus 2 instructions ahead (8)} + 8".

So is r12 loaded with 0xff8 (0xfe8+8+8) after the 1st instruction executes?

Also, regarding the 2nd instruction:
How do I calculate the value being added and stored to r12? (The comment says it's 0x8000, but I am not able to understand how it got this.)

Uneventful answered 27/8, 2010 at 11:16 Comment(0)

The first instruction (really a pseudo-instruction) loads a PC-relative address into R12. Since the instruction is at address 0xFE8, the expression {pc}+8 evaluates to 0xFF0. So the result of the first instruction is to load the value 0xFF0 into R12. The comment actually indicates this.

(Note that ADR isn't a real ARM instruction; the assembler replaces it with an instruction such as ADD. Also note that this expression's value is calculated at assembly time. During program execution, the PC points ahead of the current instruction, a legacy of the processor's pipeline. How far ahead depends on the operating mode: 8 bytes in ARM mode, 4 bytes in Thumb mode. I'm risking giving "too much information" here about ADR and PC-relative expressions/addressing, but it's easy to get bitten if you don't understand what's going on behind the scenes.)
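
To see where the 0xFF0 comes from, the instruction word from the listing can be picked apart by hand. This is only a minimal Python sketch, not the output of any tool: the helper names `ror32` and `decode_dp_immediate` are made up for illustration, and it assumes ARM mode, where reading the PC yields the instruction's own address plus 8.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def decode_dp_immediate(word):
    """Split an ARM data-processing (immediate) instruction word into fields."""
    return {
        "opcode": (word >> 21) & 0xF,        # 0b0100 means ADD
        "rn": (word >> 16) & 0xF,            # first operand register (15 is the PC)
        "rd": (word >> 12) & 0xF,            # destination register
        "rotate": 2 * ((word >> 8) & 0xF),   # rotate-right amount in bits
        "imm8": word & 0xFF,                 # 8-bit constant before rotation
    }


address = 0xFE8
fields = decode_dp_immediate(0xE28FC000)     # instruction word from the listing

# Assumption: ARM mode, so a read of the PC (r15) gives address + 8.
pc_as_read = address + 8
immediate = ror32(fields["imm8"], fields["rotate"])
r12 = (pc_as_read + immediate) & 0xFFFFFFFF

print(fields)
print(hex(r12))  # 0xff0
```

In other words, the ADR is encoded as ADD r12, pc, #0, and the +8 comes entirely from how the PC reads at run time.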

The second instruction (actually reading from right to left) effectively says "take the constant 0x8, rotate it right by 20 bits (which is the same as a left shift by 12 bits, 32-20 = 12), ADD it to R12 (which currently holds 0xFF0), and store it in R12." 0x8 << 12 = 0x8000, so the 2nd instruction results in R12 holding 0x8000 + 0xFF0 = 0x8FF0.
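
The same arithmetic as a small Python sketch. The helper name `ror32` is made up here, and the starting value of R12 is taken from the first instruction's result; the field positions are those of the ARM data-processing immediate encoding (low 8 bits hold the constant, the next 4 bits hold half the rotate amount).

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


word = 0xE28CCA08                    # second instruction word from the listing
imm8 = word & 0xFF                   # the constant 0x8
rotate = 2 * ((word >> 8) & 0xF)     # 4-bit field 0xA, doubled to 20 bits

added = ror32(imm8, rotate)          # 0x8 rotated right by 20 bits
r12_before = 0xFF0                   # result of the first instruction
r12_after = (r12_before + added) & 0xFFFFFFFF

print(imm8, rotate, hex(added), hex(r12_after))  # 8 20 0x8000 0x8ff0
```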

Note: In your explanation you said "2 instructions ahead". Don't fall into that habit; think of it as 8 bytes, not 2 instructions. The instruction says add 8 bytes; it doesn't say anything about instructions. Instructions aren't necessarily 4 bytes long (for example, in Thumb they are 2 bytes; in Thumb-2 they are 2 bytes or 4 bytes, depending on the instruction).

Bronez answered 27/8, 2010 at 13:51 Comment(1)
Hi Dan, thanks a lot! :) A lot of smoke got cleared now!! ;) Yes, you are right about the ADR being turned into an ADD. It boils down to ADD r12,pc,0. Sorry for being naive (and I still am :( until I get some hold of this). So at assembly time, does {pc}+8 just mean the address of the current instruction + 8? Is that what I should keep in mind? - Uneventful

I respectfully disagree with Dan: it IS two instructions ahead; that is how the pipeline works. The size of an instruction is either 2 bytes for Thumb or 4 bytes for ARM, so two instructions ahead results in either 4 or 8 bytes. It is not an arbitrary X bytes ahead; it is two instruction fetches ahead.
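
The "two fetches ahead" rule can be written out as a short Python sketch. The table and function names are made up for illustration, and the sizes assume ARM mode (4-byte instructions) and 16-bit Thumb instructions (2 bytes).

```python
# Assumed fetch sizes in bytes for each mode.
INSTRUCTION_SIZE = {"arm": 4, "thumb": 2}


def pc_as_read(instruction_address, mode):
    """Value seen when an instruction reads the PC: two fetches ahead."""
    return instruction_address + 2 * INSTRUCTION_SIZE[mode]


for mode in ("arm", "thumb"):
    value = pc_as_read(0xFE8, mode)
    print(mode, hex(value), value - 0xFE8, "bytes ahead")
```

For the instruction at 0xFE8 this gives 0xFF0 in ARM mode and 0xFEC in Thumb mode.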

Most folks will just use labels and never have to know how this works. For exception handlers, IF you use Thumb mode you will have to deal with it, and not all versions of the ARM ARM are clear on this. Some versions simply say that the return register holds address+8 when they mean address + two instructions (which means 4 or 8 depending on the mode, which is indicated by the least significant bit of the address). Over time the ARM ARM improves, but older ones have lots of bugs. Most folks won't ever need to know or worry about this two-instructions-ahead thing.

The main answer to your questions lies in the ARM ARM (the ARM Architecture Reference Manual), in the instruction encoding. In order to have fixed-length instructions, meaning all ARM mode instructions are 32 bits, immediate values have to be quite limited. So for many instructions like the add you only get 8 "significant bits" plus a few bits for shifting (in ARM mode, a 4-bit field that rotates the 8-bit value right by an even number of bits). So the number 0x1001 wouldn't work: in binary this value is 0b0001000000000001, and the span from the first to the last non-zero bit requires 13 bits of storage. But the 0x8000 in your example has only 1 significant bit, so it can easily be stored and shifted in a number of ways in the instruction.

For instruction sets that have variable-length instructions, x86 for example, you can have complete immediates. You can load or add the value 0x12345678 because it is not encoded in the main opcode itself; it follows the opcode in memory and can be of varying sizes to meet the needs of the instruction set. There are pros and cons to fixed and variable length that are beyond this discussion.

The point, though, is that the ARM ARM not only includes bit field definitions, but each instruction also has pseudo code explaining how the different bit fields are used, including things like the pc being two fetches ahead of the currently executing instruction.
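
A rough way to check whether a constant such as 0x1001 or 0x8000 fits in an ARM mode immediate is to try every even rotation and see whether what is left fits in 8 bits. This Python sketch assumes the ARM mode rule of an 8-bit value rotated right by an even amount; `arm_immediate_encodings` is a made-up name, not an assembler API.

```python
def arm_immediate_encodings(value):
    """List every (imm8, rotate_right) pair that produces `value`."""
    value &= 0xFFFFFFFF
    encodings = []
    for rotate in range(0, 32, 2):
        # Rotating left by `rotate` undoes a rotate-right by `rotate`.
        imm8 = ((value << rotate) | (value >> (32 - rotate))) & 0xFFFFFFFF
        if imm8 <= 0xFF:
            encodings.append((imm8, rotate))
    return encodings


print(arm_immediate_encodings(0x8000))  # [(2, 18), (8, 20), (32, 22), (128, 24)]
print(arm_immediate_encodings(0x1001))  # []
```

The pair (8, 20) is the "#8, 20" form shown in the listing, one of several ways to encode 0x8000, while 0x1001 has no encoding at all.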

PC-relative addressing is not something you normally deal with. The limited immediates you will deal with all the time, so it is good to know which instructions have what immediate lengths. It gets more difficult in Thumb mode than in ARM mode to remember which operations allow what sized immediates.

Nigritude answered 27/8, 2010 at 18:6 Comment(3)
Thanks Dwelch for the elaborate explanation! :) - Uneventful
Also respectfully, I think you misunderstood my comment about how to think about the +8. (Or maybe I misunderstood your comment?) +8 is equivalent to +2 instructions only because we're in ARM mode. If you're in Thumb mode, +8 means +4 instructions. How many instructions +8 translates to depends on the CPU operating mode. That same line of code, executed in Thumb mode, will still skip 8 bytes, but now it skips 4 instructions. The only thing you can say without more context is that it skips 8 bytes. Pipelining affects how far ahead the PC points, but not what +8 means. - Bronez
Probably a better way to explain what I meant: in C, if "fooptr" is a pointer variable, then fooptr++ literally means "advance by one foo". If it points to a char, it probably advances by 1 byte. If it points to an int, it probably advances by 4 bytes. But the C code literally means "advance by one foo", regardless of what that translates to. My point is that the "+8" literally means "+8 bytes", regardless of how many instructions that translates to. I'm struggling to explain it any better than that. - Bronez
