Why does this MOVSS instruction use RIP-relative addressing? [duplicate]
I found the following assembly code in a disassembler (floating-point logic, C++).

  842: movss  0x21a(%rip),%xmm0 

As I understand it, when the process executes this instruction, RIP will always be 842, so 0x21a(%rip) will be a constant. It seems a little odd to use this register here.

I want to know whether there is any advantage to using RIP-relative addressing instead of another addressing mode.

Nasty answered 7/7, 2017 at 9:18 Comment(0)

RIP is the instruction pointer register, which means that it contains the address of the instruction immediately following the current instruction.

For example, consider the following code:

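lea  rax, [rip]
nop

In the first line of code there, RIP points to the next instruction, so it points at the NOP. Thus, this code loads the address of the NOP instruction into the RAX register. (A MOV with the same operand would instead load the eight bytes stored at that address, not the address itself.)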

As such, it is not the case that RIP is simply a constant. Your understanding that RIP in this process "will always be 842" is not correct. The value of RIP will change, depending on where the code has been loaded into memory. 842 is just the line number, pulled from your debugging symbols; once code is compiled into a binary, it doesn't have line numbers anymore. :-)

In your disassembly, the constant is the offset (0x21A). That's the offset from the current value in RIP. Another way of writing this is: %rip + 0x21A.
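To make that arithmetic concrete, here is a minimal Python sketch. It rests on two assumptions that are not stated in the question: that 842 is the instruction's hexadecimal offset within the file (as one of the comments on this answer suggests), and that this particular movss encoding is 8 bytes long (F3 0F 10 05 followed by a 4-byte displacement). The function name is made up for illustration.

```python
def rip_relative_target(insn_addr: int, insn_len: int, disp: int) -> int:
    """Effective address of a RIP-relative memory operand.

    While an instruction executes, RIP already holds the address of the
    *next* instruction, so the displacement is added to insn_addr + insn_len.
    """
    next_insn = insn_addr + insn_len
    return next_insn + disp


# Assumed values: instruction at offset 0x842, 8 bytes long, disp = 0x21a.
target = rip_relative_target(0x842, 8, 0x21A)
print(hex(target))  # 0xa64
```

Under those assumptions the constant sits at offset 0xa64 in the image, wherever the image ends up being loaded.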

RIP-relative addressing is a new form of effective addressing introduced with 64-bit long mode. The point is that it makes it easier to write position-independent code because you can make any memory reference RIP-relative. In fact, RIP-relative addressing is the default addressing mode in 64-bit applications. Virtually all instructions that address memory in 64-bit mode are RIP-relative. I'll quote from Ken Johnson (aka Skywing)'s blog because I couldn't say it any better myself:

One of the larger (but often overlooked) changes to x64 with respect to x86 is that most instructions that previously only referenced data via absolute addressing can now reference data via RIP-relative addressing.

RIP-relative addressing is a mode where an address reference is provided as a (signed) 32-bit displacement from the current instruction pointer. While this was typically only used on x86 for control transfer instructions (call, jmp, and soforth), x64 expands the use of instruction pointer relative addressing to cover a much larger set of instructions.

What’s the advantage of using RIP-relative addressing? Well, the main benefit is that it becomes much easier to generate position independent code, or code that does not depend on where it is loaded in memory. This is especially useful in today’s world of (relatively) self-contained modules (such as DLLs or EXEs) that contain both data (global variables) and the code that goes along with it. If one used flat addressing on x86, references to global variables typically required hardcoding the absolute address of the global in question, assuming the module loads at its preferred base address. If the module then could not be loaded at the preferred base address at runtime, the loader had to perform a set of base relocations that essentially rewrite all instructions that had an absolute address operand component to refer to take into account the new address of the module.

[ . . . ]

An instruction that uses RIP relative addressing, however, typically does not require any base relocations (otherwise known as “fixups”) at load time if the module containing it is relocated, however. This is because as long as portions of the module are not internally re-arranged in memory (something not supported by the PE format), any addresses reference that is both relative to the current instruction pointer and refers to a location within the confines of the current image will continue to refer to the correct location, no matter where the image is placed at load time.

As a result, many x64 images have a greatly reduced number of fixups, due to the fact that most operations can be performed in an RIP-relative fashion.
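The fixup argument in the quote can be checked with a small Python simulation. This is only a sketch: the offsets and load addresses below are hypothetical, and the "image" is nothing more than two offsets from a common base.

```python
# Hypothetical layout of one module: offsets are relative to the image base.
CODE_OFFSET = 0x842   # where the instruction lives inside the image
INSN_LEN = 8          # assumed length of the instruction in bytes
DATA_OFFSET = 0xA64   # where the constant lives inside the image


def absolute_operand(base: int) -> int:
    """Operand an absolute-addressed instruction would have to embed."""
    return base + DATA_OFFSET


def rip_relative_disp(base: int) -> int:
    """Displacement a RIP-relative instruction embeds for the same data."""
    next_insn = base + CODE_OFFSET + INSN_LEN
    return (base + DATA_OFFSET) - next_insn


# Load the same image at two different (hypothetical) base addresses.
for base in (0x400000, 0x7F0000000000):
    print(hex(base), hex(absolute_operand(base)), hex(rip_relative_disp(base)))
```

The absolute operand changes with the base, so the loader would have to patch it, while the displacement is the same at both bases because the base cancels out of the subtraction.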

He's speaking in the context of Windows, but something conceptually similar applies on other operating systems as well.

The code you have is loading a constant value, stored somewhere in the binary image, into the XMM0 register, and it's doing so using RIP-relative addressing because of its many advantages.
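For the curious, here is a rough Python sketch of how a disassembler could recognise this as a RIP-relative load. The byte string is my reconstruction from the mnemonic, not something shown in the question, and the decoder is deliberately tiny: it handles only the F3 0F 10 form of MOVSS and ignores REX prefixes.

```python
import struct


def decode_rip_relative_movss(code: bytes) -> tuple[int, int]:
    """Return (xmm register number, signed disp32) for a RIP-relative MOVSS load.

    Handles only the 'F3 0F 10 /r' load form without a REX prefix.
    """
    if code[:3] != b"\xf3\x0f\x10":
        raise ValueError("not a MOVSS xmm, xmm/m32 instruction")
    modrm = code[3]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # In 64-bit mode, mod=00 with rm=101 selects RIP + disp32.
    if mod != 0 or rm != 5:
        raise ValueError("operand is not RIP-relative")
    (disp,) = struct.unpack_from("<i", code, 4)  # little-endian, signed
    return reg, disp


# Assumed encoding of: movss 0x21a(%rip),%xmm0
reg, disp = decode_rip_relative_movss(bytes.fromhex("f30f10051a020000"))
print(f"xmm{reg}", hex(disp))  # xmm0 0x21a
```

The displacement is a signed 32-bit value, which is why the referenced data has to lie within roughly 2 GiB of the instruction.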

Serious answered 7/7, 2017 at 9:32 Comment(4)
One overlooked reason for the RIP-relative addressing IMO is that there are no 64-bit displacements for the SIB byte and the only instructions that take a moffs64 as immediate are mov rax, moffs64 (A1) and mov moffs64, rax (A3). I believe that RIP-relative addressing was introduced to avoid having 64-bit immediates rather than for pursuing PIC, which is a side effect. After all, this is how this problem has been solved in RISC as well. However these are just my two cents. - Electra
@MargaretBloom: When AMD was designing AMD64, the overhead of PIC for i386 was very well known (especially for Linux where it ties up a register for a pointer to the GOT), and I think putting a significant amount of code in libraries was already starting to happen back in 2000. I'm sure the PIC benefit was something they had in mind, as well as allowing code+data to be efficient even when loaded outside the low 2 or 4GB of virtual address space. - Herzl
"Virtually all instructions that address memory in 64-bit mode are RIP-relative." - true for instructions that address static data directly. (Global variables and static stuff). But I think most memory accesses in most code are through pointers in registers, either the stack pointer for locals that aren't in registers, or through pointer variables. And BTW, indexing static arrays can't use RIP-relative addressing for the actual load or store. e.g. [arr + rax*4] if arr can be an absolute disp32, otherwise a RIP-relative LEA into a reg then [rcx + rax*4]. - Herzl
Note: the number OP sees is not "the line number, pulled from your debugging symbols" - I don't know any disassembler that uses that kind of representation or even does such kind of inspection by default. The number shown is just the offset (in bytes and in hexadecimal notation) from the start of the .text section (which for a position independent executable is going to be set to zero). - Bradawl
