64-bit syscall documentation for MacOS assembly

I'm having trouble finding good documentation for writing 64-bit assembly on macOS.

The 64-bit SysV ABI says the following in section A.2.1 and this SO post quotes it:

  • A system-call is done via the syscall instruction. The kernel destroys registers %rcx and %r11.

  • Returning from the syscall, register %rax contains the result of the system-call. A value in the range between -4095 and -1 indicates an error, it is -errno.
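
To make the quoted rule concrete, here is a minimal C sketch of how a raw rax value would be interpreted under the Linux convention. The function names are hypothetical and only model the rule; nothing here issues a real system call.

```c
#include <stdbool.h>

/* Linux x86-64 convention quoted above: a raw rax value in the range
   [-4095, -1] signals failure, and its negation is the errno code. */
static bool linux_syscall_failed(long raw_rax)
{
    return raw_rax >= -4095 && raw_rax <= -1;
}

/* Returns the errno code for a failed call, or 0 for a successful one. */
static int linux_syscall_errno(long raw_rax)
{
    return linux_syscall_failed(raw_rax) ? (int)-raw_rax : 0;
}
```

For example, `linux_syscall_errno(-9)` gives 9, while a successful write returning 12 is not treated as an error.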

Those two sentences hold on Linux but not on macOS Sierra, as the following code shows:

global _start
extern _exit

section .text
_start:

; Align stack to 16 bytes for libc
and rsp, 0xFFFFFFFFFFFFFFF0

; Call write
mov rdx, 12             ; size
mov rsi, hello          ; buf
mov edi, 1              ; fd
mov rax, 0x2000004      ; write (on Linux, use mov rax, 0x1 instead)
syscall

jc .err                 ; Jumps on error on macOS, but why?
jnc .ok

.err:
mov rdi, -1
call _exit              ; exit(-1)

.ok:
; Expect rdx to be 12, but it isn't on macOS!
mov rdi, rdx
call _exit              ; exit(rdx)

; String for write
section .data
hello:
.str db `Hello world\n`
.len equ $-hello.str

Compile with NASM:

; MacOS: nasm -f macho64 syscall.asm && ld syscall.o -lc -macosx_version_min 10.12 -e _start -o syscall
; Linux: nasm -f elf64 syscall.asm -o syscall.o && ld syscall.o -lc -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o syscall

Run on macOS:

./syscall      # Return value 0
./syscall >&-  # Return value 255 (-1)
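
The 255 shown for exit(-1) comes from the shell reporting only the low 8 bits of the exit value. A small C sketch of that truncation, with a hypothetical helper name:

```c
/* Hypothetical helper: the low 8 bits of the value passed to exit(),
   which is all that the shell's $? reports. */
static unsigned visible_exit_status(long value)
{
    return (unsigned)(value & 0xFF);
}
```

So `visible_exit_status(-1)` is 255, and a successful run that exited with 12 would show 12.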

I found out that:

  • A syscall returns errno and sets the carry flag on error, instead of returning -errno in rax
  • rdx register is clobbered by syscall
  • On Linux, everything works as expected
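
The macOS behaviour observed above can be modelled in C roughly as follows. This is only a sketch of the convention as I observed it; the struct and function names are made up for illustration and no system call is issued.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the state right after the syscall instruction. */
struct syscall_result {
    uint64_t rax;   /* return value, or a positive errno code on failure */
    bool     carry; /* carry flag: set when the call failed */
};

/* Convert to the C-library style: -1 with *err set on failure,
   otherwise the return value with *err cleared. */
static int64_t macos_syscall_decode(struct syscall_result r, int *err)
{
    if (r.carry) {
        *err = (int)r.rax;
        return -1;
    }
    *err = 0;
    return (int64_t)r.rax;
}
```

With stdout closed, the failed write would correspond to `{ .rax = 9, .carry = true }` (9 being EBADF), which decodes to -1 with `*err == 9`. A successful 12-byte write is `{ .rax = 12, .carry = false }`.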

Why is rdx clobbered? Why doesn't a syscall return -errno? Where can I find the real documentation?

The only place I found where someone talks about the carry flag for syscall errors is here

Flocculent answered 15/12, 2017 at 14:37 Comment(5)
Because this is an excerpt from a section entitled A.2 AMD64 Linux Kernel Conventions? - Tavy
Read this: idryman.org/blog/2014/12/02/writing-64-bit-assembly-on-mac-os-x - Tavy
Good question. I made a quick edit to the SO post you link to, so it no longer claims that section also applies to *BSD. (It didn't mention OS X before; does Darwin count as a *BSD?) @Jean-BaptisteYunès, do you happen to know if FreeBSD or OpenBSD use the same convention as Linux, or as OS X, on x86-64? - Souterrain
@PeterCordes Alas, it seems that the different BSD flavors don't use the same ABI. FreeBSD seems to be Linux compatible. Minix and NetBSD are compatible. It is difficult to find information about it (I haven't read about these things in many years, so maybe I just don't know where to search). One more: unix.stackexchange.com/questions/3322/… - Tavy
Also related: sigsegv.pl/osx-bsd-syscalls - Souterrain

I used this:

# as hello.asm -o hello.o
# ld hello.o -macosx_version_min 10.13 -e _main -o hello  -lSystem
.section __DATA,__data
str:
  .asciz "Hello world!\n"

.section __TEXT,__text
.globl _main
_main:
  movl $0x2000004, %eax           # preparing system call 4
  movl $1, %edi                   # STDOUT file descriptor is 1
  movq str@GOTPCREL(%rip), %rsi   # The value to print
  movq $13, %rdx                  # the size of the value to print
  syscall

  movl %eax, %edi
  movl $0x2000001, %eax           # exit (return value of the call to write())
  syscall

With this I was able to get the return value in eax. Here the return value is the number of bytes actually written by the write system call. And yes, macOS being a BSD variant, it is the carry flag that tells you whether the syscall failed (errno is just a variable with external linkage).

# hello_asm.s
# as hello_asm.s -o hello_asm.o
# ld hello_asm.o -e _main -o hello_asm
.section __DATA,__data
str:
        .asciz "Hello world!\n"
good:
        .asciz "OK\n"

.section __TEXT,__text
.globl _main
_main:
        movl $0x2000004, %eax           # preparing system call 4
        movl $5, %edi                   # fd 5 is not open, so this write fails
        movq str@GOTPCREL(%rip), %rsi   # The value to print
        movq $13, %rdx                  # the size of the value to print
        syscall

        jc err

        movl $0x2000004, %eax           # preparing system call 4
        movl $1, %edi                   # STDOUT file descriptor is 1
        movq good@GOTPCREL(%rip), %rsi  # The value to print
        movq $3, %rdx                   # the size of the value to print
        syscall
        movl $0, %edi
        movl $0x2000001, %eax           # exit 0
        syscall
err:    
        movl $1, %edi
        movl $0x2000001, %eax           # exit 1
        syscall

This exits with status 1 because descriptor 5 is not open. If you use descriptor 1 instead, the first write succeeds, the program prints the second message as well, and it exits with 0.
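
The constants 0x2000004 and 0x2000001 used above appear to be the BSD call numbers (4 for write, 1 for exit) combined with a class value of 2 in the upper bits. The following C sketch shows that composition. The macro names are modelled on what I understand the XNU sources to use, but treat the names and the exact layout as assumptions rather than documented API.

```c
#include <stdint.h>

/* Assumed layout: class in bits 24 and up, call number in the low 24 bits. */
#define SYSCALL_CLASS_SHIFT 24
#define SYSCALL_CLASS_UNIX  2u   /* assumed class value for BSD system calls */
#define SYSCALL_NUMBER_MASK ((1u << SYSCALL_CLASS_SHIFT) - 1)

/* Build the value loaded into eax for a BSD-class system call. */
static uint32_t unix_syscall(uint32_t number)
{
    return (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT)
         | (number & SYSCALL_NUMBER_MASK);
}
```

Under these assumptions `unix_syscall(4)` yields 0x2000004 and `unix_syscall(1)` yields 0x2000001, matching the values in the listings.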

Wile answered 15/12, 2017 at 16:10 Comment(3)
Your last comment should say "exit 1" since you movl $1 into %edi. - Openandshut
movq str@GOTPCREL(%rip), %rsi is ridiculous. Just use a RIP-relative LEA to get the address of your own static data! And your comment about exit 0 is wrong: you're actually passing the return value of write(), which is either an error (e.g. if stdout is closed) or 13 (the number of bytes written). You also don't need to 0-terminate the data because you're only using it with explicit-length functions. Then you could let the assembler calculate the length for you instead of having to hardcode 13. See Hello World in x86 assembler on Mac 0SX - Souterrain
Is there any official or widely recognized documentation of the MacOS system call conventions, including the use of the carry flag? I couldn't find anything authoritative when searching, just a lot of "folklore". - Gadhelic

I don't know why rdx gets clobbered, but I can confirm that it does seem to get zeroed across the "write" system call. I examined the state of every register:

global _start
section .text
_start:

mov rax, 0xDEADBEEF; 0xDEADBEEF = 3735928559; 3735928559 mod 256 = 239
mov rbx, 0xDEADBEEF
mov rcx, 0xDEADBEEF
mov rdx, 0xDEADBEEF
mov rsi, 0xDEADBEEF
mov rdi, 0xDEADBEEF
mov rsp, 0xDEADBEEF
mov rbp, 0xDEADBEEF
mov r8, 0xDEADBEEF
mov r9, 0xDEADBEEF
mov r10, 0xDEADBEEF
mov r11, 0xDEADBEEF
mov r12, 0xDEADBEEF
mov r13, 0xDEADBEEF
mov r14, 0xDEADBEEF
mov r15, 0xDEADBEEF

mov rdx, len2           ; size
mov rsi, msg2           ; buf
mov rdi, 1              ; fd
mov rax, 0x2000004      ; write
syscall

mov rdi, rsi            ; CHANGE THIS TO EXAMINE DIFFERENT REGISTERS
mov rax, 0x2000001      ; exit
syscall

section .data
msg_pad db `aaaa\n`     ; padding so that msg2 is not page-aligned,
msg2 db `bbbbbb\n`      ; which makes it easier to notice whether
len2 equ $-msg2         ; a register was clobbered or not

nasm -f macho64 syscall.asm && ld syscall.o -e _start -static && ./a.out; echo "status: $?"

The results I got:

clobber list of a "write" syscall

rax     clobbered
rbx     not clobbered
rcx     clobbered
rdx     clobbered <- This is the unexpected case?!
rsi     not clobbered
rdi     not clobbered
rsp     not clobbered
rbp     not clobbered
r8      not clobbered
r9      not clobbered
r10     not clobbered
r11     clobbered
r12     not clobbered
r13     not clobbered
r14     not clobbered
r15     not clobbered

It would be interesting to know whether other syscalls zero rdx too; I didn't have the energy to attempt a thorough investigation. But maybe, just to be safe, one should add rdx to the clobber list of all macOS syscalls from now on.
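
One limitation of this method is that the exit status only exposes the low byte of the register being examined. A possible extension, which I did not run, would be to shift the register right by 8*k bits before exiting and repeat for k = 0..7. The C sketch below models that bookkeeping with hypothetical helper names.

```c
#include <stdint.h>

/* What $? would show if the test exited with (reg >> 8*k): byte k of reg. */
static unsigned status_for_byte(uint64_t reg, unsigned k)
{
    return (unsigned)((reg >> (8 * k)) & 0xFF);
}

/* Rebuild a 64-bit register value from eight observed statuses,
   statuses[0] being the least significant byte. */
static uint64_t rebuild_register(const unsigned statuses[8])
{
    uint64_t value = 0;
    for (unsigned k = 0; k < 8; k++)
        value |= (uint64_t)(statuses[k] & 0xFF) << (8 * k);
    return value;
}
```

For the pattern used in the test, `status_for_byte(0xDEADBEEF, 0)` is 239, whereas a zeroed register gives 0 for every byte.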

Shagreen answered 18/3, 2022 at 14:2 Comment(2)
You can just single-step the syscall instruction with a debugger which highlights changed registers (like GDB with layout reg). I notice your test values all fit in 32 bits, so you wouldn't notice if any register just got truncated to 32 bits. (Unlikely but possible). You could increment RAX in a loop (reloading from memory) to check multiple system call numbers, with garbage args at least; should be quick to visually check with a breakpoint set at the right place, after you reload all regs with a pattern. - Souterrain
I don't know MacOS / Darwin internals, IDK why RDX would get clobbered. That seems weird. IIRC I read something about iOS clobbering an AArch64 register for no apparent reason (to enforce the fact it was reserved), but that might have been asynchronous, not just on syscalls. And that they stopped doing that for AArch64 MacOS. That might be unrelated. - Souterrain
