Position Independent Code pointing to wrong address

I have a small example program written in NASM (2.11.08) targeting the macho64 output format. I'm running OS X 10.10.3:

bits 64

section .data

    msg1    db 'Message One', 10, 0
    msg1len equ $-msg1
    msg2    db 'Message Two', 10, 0
    msg2len equ $-msg2

section .text
    global  _main
    extern  _printf

_main:
    sub     rsp, 8  ; align

    lea     rdi, [rel msg1]
    xor     rax, rax
    call    _printf

    lea     rdi, [rel msg2]
    xor     rax, rax
    call    _printf

    add     rsp, 8
    ret

I'm assembling and linking using the following command line:

/usr/local/bin/nasm -f macho64 test2.s
ld -macosx_version_min 10.10.0 -lSystem -o test2 test2.o

When I run objdump on the test2 executable, this is the relevant snippet (I can post more if I'm wrong!):

0000000000001fb7 <_main>:
1fb7:   48 83 ec 08             sub    $0x8,%rsp
1fbb:   48 8d 3d 56 01 00 00    lea    0x156(%rip),%rdi        # 2118 <msg2+0xf3>
1fc2:   48 31 c0                xor    %rax,%rax
1fc5:   e8 14 00 00 00          callq  1fde <_printf$stub>
1fca:   48 8d 3d 54 00 00 00    lea    0x54(%rip),%rdi        # 2025 <msg2>
1fd1:   48 31 c0                xor    %rax,%rax
1fd4:   e8 05 00 00 00          callq  1fde <_printf$stub>
1fd9:   48 83 c4 08             add    $0x8,%rsp
1fdd:   c3                      retq  

...

0000000000002018 <msg1>:
0000000000002025 <msg2>:

And, finally, the output:

$ ./test2
Message Two
$

My question is, what happened to msg1?

I'm assuming msg1 isn't printed because 0x156(%rip) is not the correct address (just nulls).

Why is lea rdi, [rel msg2] pointing to the correct address, while lea rdi, [rel msg1] is pointing past msg2, into nulls?

It looks like the address that 0x156(%rip) resolves to (0x2118) is exactly 0x100 beyond where msg1 lies in memory (0x2018). This is true throughout many tests of this problem.
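
The numbers in the disassembly can be checked by hand: a RIP-relative operand resolves to the address of the next instruction plus the displacement. Here is a small shell sketch of that arithmetic. The helper name rip_target is hypothetical, and the addresses are copied from the objdump listing above:

```shell
# A RIP-relative operand resolves to: address of the NEXT instruction + disp32.
# rip_target is a hypothetical helper used only for this illustration.
rip_target() {
    # $1 = address of the instruction following the lea, $2 = displacement
    printf '0x%x\n' "$(( $1 + $2 ))"
}

first=$(rip_target 0x1fc2 0x156)    # first lea: the next instruction is at 0x1fc2
second=$(rip_target 0x1fd1 0x54)    # second lea: the next instruction is at 0x1fd1
echo "first lea  -> $first"         # compare with msg1 at 0x2018
echo "second lea -> $second"        # compare with msg2 at 0x2025

# Distance between the first lea's target and the real address of msg1.
printf 'off by: 0x%x\n' "$(( $first - 0x2018 ))"
```

The second lea lands exactly on msg2, while the first lands 0x100 bytes past msg1. That matches the symptom: the displacement stored for msg1 in the instruction is wrong, not the data itself.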

What am I missing here?

Edit: Whichever message (msg1 or msg2) appears last in the .data section is the only message that gets printed.

Roister asked 26/7, 2015 at 18:24 Comment(1)
The resolution here ended up being yasm. I am hardly in a position to say nasm isn't working correctly, but after changing to yasm everything went as expected. - Roister

IDK about the Mach-O ABI, but if it's the same as the System V x86-64 ABI that GNU/Linux uses, then I think your problem is that you need to clear al (e.g. with xor eax, eax) to tell a varargs function like printf that there are zero FP arguments in vector registers.

Also, lea rdi, [rel msg1] would be a much better choice than lea edi, [rel msg1]. With edi, your code is only position-independent within the low 32 bits of virtual address space, because you're truncating the pointers to 32 bits.

It appears NASM has a bug. This same problem came up again in another question, "NASM 2 lines of db (initialized data) seemingly not working". There, the OP confirmed that the data was present but the labels were wrong, and is hopefully reporting it upstream.

Thymelaeaceous answered 26/7, 2015 at 18:43 Comment(10)
Good catch on the registers. I've updated to rdi, and am clearing rax as well (example code updated). I still have the same problem, but the code is better. - Roister
The usual idiom is xor eax, eax. It's one byte shorter. Writing to a 32-bit reg always clears the upper 32 bits. What does the compiler output for a C function that calls printf twice look like on your platform? Compile with optimization on, so you don't get a bunch of redundant loads/stores. - Thymelaeaceous
Also, in gdb, you can check the register values before the call, and make sure the memory contents at that address are msg1. Instructions for using gdb with asm are at stackoverflow.com/tags/x86/info - Thymelaeaceous
I've taken a look at how gcc compiles under OS X. Aside from the syntax (AT&T), it looks the same: position-independent (leaq L_.str(%rip), %rdi), then the xorl for %eax. In GDB, there is no data at the [rel msg1] address, just 0x0's. Since OS X and Linux are almost identical (for this very small example program, I mean), I used the same code on Linux. Aside from macho64 vs. elf64, and specifying an entry point at _start, the code is the same. It works on Linux, however. - Roister
I'm sure I'm not understanding something about the macho64 layout, and am causing the position-independent stuff to be mucked up. - Roister
@ryanday: You can use gcc -S -masm=intel to get the syntax you're familiar with. Maybe there is a problem with your lea, then, if you're getting all-zeros at the address loaded into %rdi. I don't have access to an OS X box, just Linux, but I'll take a look in GDB myself in a few minutes. - Thymelaeaceous
@ryanday: Works for me under GNU/Linux (after removing the _ from main and printf). I used yasm -f elf64 and gcc rip-rel.o. In GDB, x /16c 0x601040 (the value of RDI after the lea) gives 77 'M' 101 'e' 115 's' 115 's' 97 'a' ... - Thymelaeaceous
@ryanday: Are you sure your linker is working properly? It has to write the correct offset into the rip-relative instructions after it decides where the .data and .text sections will be relative to each other. (Also note that string constants could go in .rodata, so they can be shared. What you're doing should work, of course.) - Thymelaeaceous
Just to wrap up, I tried yasm after your success. It worked for me, with no other modifications. The problem occurs with nasm, and not with yasm (or the default assembler on OS X in AT&T syntax). - Roister
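
For anyone wanting to reproduce the comparison, here is a minimal sketch of the assembler swap. It assumes nasm, yasm, ld and a GNU-style objdump are all on PATH on an OS X machine. The function name build_and_dump and the test2-<assembler> output names are hypothetical, chosen only so the two builds don't overwrite each other:

```shell
# Sketch only: assemble test2.s with the given assembler, link it with the
# same ld command line as in the question, then show the lea instructions.
build_and_dump() {
    asm=$1    # "nasm" or "yasm"; both accept -f macho64 and -o
    "$asm" -f macho64 test2.s -o "test2-$asm.o" &&
    ld -macosx_version_min 10.10.0 -lSystem -o "test2-$asm" "test2-$asm.o" &&
    objdump -d "test2-$asm" | grep lea
}
```

Running build_and_dump nasm and then build_and_dump yasm should make any difference in the displacement of the first lea visible side by side.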
