How to use RIP Relative Addressing in a 64-bit assembly program?

How do I use RIP-relative addressing in a Linux assembly program for the AMD64 architecture? I am looking for a simple example (a Hello world program) that uses the AMD64 RIP-relative addressing mode.

For example, the following 64-bit assembly program works with normal (absolute) addressing:

.text
    .global _start

_start:
    mov $0xd, %rdx

    mov $msg, %rsi
    pushq $0x1
    pop %rax
    mov %rax, %rdi
    syscall

    xor %rdi, %rdi
    pushq $0x3c
    pop %rax
    syscall

.data
msg:
    .ascii    "Hello world!\n"

I am guessing that the same program using RIP Relative Addressing would be something like:

.text
    .global _start

_start:
    mov $0xd, %rdx

    mov msg(%rip), %rsi
    pushq $0x1
    pop %rax
    mov %rax, %rdi
    syscall

    xor %rdi, %rdi
    pushq $0x3c
    pop %rax
    syscall

msg:
    .ascii    "Hello world!\n"

The normal version runs fine when assembled and linked with:

as -o hello.o hello.s && ld -s -o hello hello.o && ./hello

But I can't get the RIP version working.

Any ideas?

--- edit ----

Stephen Canon's answer makes the RIP version work.

Now when I disassemble the executable of the RIP version I get:

objdump -d hello

0000000000400078 <.text>:
  400078: 48 c7 c2 0d 00 00 00  mov    $0xd,%rdx
  40007f: 48 8d 35 10 00 00 00  lea    0x10(%rip),%rsi        # 0x400096
  400086: 6a 01                 pushq  $0x1
  400088: 58                    pop    %rax
  400089: 48 89 c7              mov    %rax,%rdi
  40008c: 0f 05                 syscall 
  40008e: 48 31 ff              xor    %rdi,%rdi
  400091: 6a 3c                 pushq  $0x3c
  400093: 58                    pop    %rax
  400094: 0f 05                 syscall 
  400096: 48                    rex.W
  400097: 65                    gs
  400098: 6c                    insb   (%dx),%es:(%rdi)
  400099: 6c                    insb   (%dx),%es:(%rdi)
  40009a: 6f                    outsl  %ds:(%rsi),(%dx)
  40009b: 20 77 6f              and    %dh,0x6f(%rdi)
  40009e: 72 6c                 jb     0x40010c
  4000a0: 64 21 0a              and    %ecx,%fs:(%rdx)

This shows what I was trying to accomplish: lea 0x10(%rip),%rsi loads the address 17 bytes after the lea instruction, which is 0x400096, where the Hello world string can be found. The result is position-independent code.

Kaifeng answered 14/7, 2010 at 20:36 Comment(3)
Why 17 bytes after (0x10 is 16)? - Breast
tortall.net/projects/yasm/manual/html/nasm-effaddr.html says: "RIP is the instruction pointer register, which contains the address of the location immediately following the current instruction", but the lea instruction is seven bytes long, not one. - Breast
Related: "How to load address of function or label into register" covers RIP-relative LEA and the optimization of mov $msg, %esi for non-PIE executables (or movabs with a 64-bit absolute address, for code models where you have more than 2GiB of code + static data). - Ruminate
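
On the 17-versus-16 question in the comments above, the arithmetic from the disassembly is as follows. The lea starts at 0x40007f and is 7 bytes long, so RIP holds 0x400086 (the address of the next instruction) while it executes, and 0x400086 + 0x10 = 0x400096. The string is therefore 0x10 (16) bytes past the end of the lea, or 0x17 (23) bytes past its start. A minimal sketch that checks this rule at run time, assuming x86-64 Linux and GNU as; the program is illustrative, not part of the original question, and it reports the result only through its exit status:

```asm
    .text
    .global _start

_start:
    lea  0x10(%rip), %rax    # plain number: 0x10 bytes past the END of this 7-byte lea
    lea  _start(%rip), %rbx  # rbx = address of the first lea (its START)
    add  $0x17, %rbx         # start + 7 (instruction length) + 0x10 (displacement)

    xor  %edi, %edi          # assume success: exit status 0
    cmp  %rax, %rbx          # both registers should hold the same address
    je   1f
    mov  $1, %edi            # mismatch: exit status 1
1:
    mov  $60, %eax           # exit system call number on x86-64 Linux
    syscall
```

Assembled and linked the same way as the programs in the question, this should exit with status 0 (check with echo $?). The first lea only computes an address and never reads memory, so it does not matter what is stored at that location.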

I believe that you want to load the address of your string into %rsi; your code attempts to load a quadword from that address rather than the address itself. You want:

lea msg(%rip), %rsi

if I'm not mistaken. I don't have a Linux box to test on, however.
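
For completeness, here is the question's RIP version with that one-line change applied. It is a sketch for x86-64 Linux and GNU as; the comments are added for illustration, and the asker's edit to the question reports that the fix works.

```asm
    .text
    .global _start

_start:
    mov  $0xd, %rdx          # arg 3: byte count (13, the length of the string)

    lea  msg(%rip), %rsi     # arg 2: address of msg, computed relative to RIP
    pushq $0x1
    pop  %rax                # rax = 1, the write system call
    mov  %rax, %rdi          # arg 1: file descriptor 1 (stdout)
    syscall

    xor  %rdi, %rdi          # exit status 0
    pushq $0x3c
    pop  %rax                # rax = 60, the exit system call
    syscall

msg:
    .ascii    "Hello world!\n"
```

It can be built with the same command as in the question: as -o hello.o hello.s && ld -s -o hello hello.o && ./hello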

Maverick answered 15/7, 2010 at 21:19 Comment(4)
If you use lea msg(%rsp), %rsi instead of lea msg(%rip), %rsi (or any register other than rip), the address of the msg label itself gets added, not its offset from the current value of that register. For example, if msg is at address 0x1FF, then lea msg(%rsp), %rsi gives rsi = rsp + 0x1FF, not rsi = (0x1FF - rsp) + rsp. With rip, the disassembler showed 0x10(%rip) because the distance from the current rip to msg is 0x10 bytes. But I can't find anything in the documentation saying that the calculation differs between rip and other registers. - Anthropologist
@StephenCanon this works on x86_64. What is the equivalent of lea msg(%rip), %rsi in 32-bit assembly? - Antecedence
@Zibri: there is no direct position-independent equivalent; that's why AMD64 added RIP-relative addressing. Under Linux, compilers use offsets relative to the GOT. Of course, in position-dependent 32-bit code you simply use mov $msg, %esi, the same as you would in position-dependent 64-bit code (under Linux, where static symbol addresses are known to be in the low 2GiB of virtual address space in non-PIE executables). - Ruminate
@user2808671: yes, msg(%rip) is a special case that means "symbol with respect to RIP", not "absolute address + RIP". The bottom of sourceware.org/binutils/docs/as/i386_002dMemory.html documents this. - Ruminate
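
The difference discussed in these comments can be observed directly. The following is a sketch, assuming x86-64 Linux, GNU as, and a non-PIE link with plain ld as in the question (the msg(%rsp) form needs the absolute address of msg to fit in a signed 32-bit displacement). It is not from the original thread, and the register choices are arbitrary:

```asm
    .text
    .global _start

_start:
    lea  msg(%rip), %rsi     # symbol(%rip): displacement is msg minus the end of this lea, so rsi = address of msg
    lea  msg(%rsp), %rdx     # symbol(other register): displacement is msg's absolute address, so rdx = rsp + msg
    sub  %rsp, %rdx          # remove rsp again, leaving msg's absolute address

    xor  %edi, %edi          # assume success: exit status 0
    cmp  %rsi, %rdx          # both should now hold the address of msg
    je   1f
    mov  $1, %edi            # mismatch: exit status 1
1:
    mov  $60, %eax           # exit system call number on x86-64 Linux
    syscall

msg:
    .ascii    "Hello world!\n"
```

If the two forms behave as the comment describes, the program exits with status 0. Only the %rip form is position independent; the %rsp form bakes the link-time address of msg into the instruction.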