Printing floating point numbers from x86-64 seems to require %rbp to be saved

Asked 19/4, 2013 at 4:23 Answered 19/4, 2013 at 4:40

Solved assembly floating-point x86-64

When I write a simple assembly language program, linked with the C library, using gcc 4.6.1 on Ubuntu, and I try to print an integer, it works fine:

        .global main
        .text
main:
        mov     $format, %rdi
        mov     $5, %rsi
        mov     $0, %rax
        call    printf
        ret
format:
        .asciz  "%10d\n"

This prints 5, as expected.

But now if I make a small change, and try to print a floating point value:

        .global main
        .text
main:
        mov     $format, %rdi
        movsd   x, %xmm0
        mov     $1, %rax
        call    printf
        ret
format:
        .asciz  "%10.4f\n"
x:
        .double 15.5

This program segfaults without printing anything. Just a sad segfault.

But I can fix this by pushing and popping %rbp.

        .global main
        .text
main:
        push    %rbp
        mov     $format, %rdi
        movsd   x, %xmm0
        mov     $1, %rax
        call    printf
        pop     %rbp
        ret
format:
        .asciz  "%10.4f\n"
x:
        .double 15.5

Now it works, and prints 15.5000.

My question is: why did pushing and popping %rbp make the application work? According to the ABI, %rbp is one of the registers that the callee must preserve, and so printf cannot be messing it up. In fact, printf worked in the first program, when only an integer was passed to printf. So the problem must be elsewhere?

Divebomb asked 19/4, 2013 at 4:23 Comment(4)

Out of interest, what's the purpose of that mov into %rax? – Necrose 19/4, 2013 at 4:37

Count of floating point arguments, IIRC. – Martyrology 19/4, 2013 at 4:45

Related: you can't directly print a float with printf, only double (with "%f") or long double, because of C promotion rules for variadic functions. See "How to print a single-precision float with printf". – Projectionist 6/6, 2018 at 7:31

Also related: glibc printf only cares about stack alignment when %al != 0, because that's how gcc compiles variadic functions that might accept FP args. "printf float in nasm assembly 64-bit" shows that printf happens not to crash when called with a misaligned stack and RAX=0, and the answer there shows gcc's code (which runs only for non-zero AL) that dumps xmm0..7 to the stack with movaps (variadic functions can accept __m128 args too, not just double). – Projectionist 6/6, 2018 at 7:47

I suspect the problem doesn't have anything to do with %rbp, but rather has to do with stack alignment. To quote the ABI:

The ABI requires that stack frames be aligned on 16-byte boundaries. Specifically, the end of the argument area (%rbp+16) must be a multiple of 16. This requirement means that the frame size should be padded out to a multiple of 16 bytes.

The stack is 16-byte aligned just before the call that enters main(). That call pushes an 8-byte return address, so on entry to main() %rsp is off by 8 bytes. You restore the alignment before calling printf() by pushing another eight bytes onto the stack (which happen to be %rbp but could just as easily be something else).

Here is the code that gcc generates (also on the Godbolt compiler explorer):

.LC1:
        .ascii "%10.4f\12\0"
main:
        leaq    .LC1(%rip), %rdi   # format string address
        subq    $8, %rsp           ### align the stack by 16 before a CALL
        movl    $1, %eax           ### 1 FP arg being passed in a register to a variadic function
        movsd   .LC0(%rip), %xmm0  # load the double itself
        call    printf
        xorl    %eax, %eax         # return 0 from main
        addq    $8, %rsp
        ret

As you can see, it deals with the alignment requirements by subtracting 8 from %rsp at the start, and adding it back at the end.

You could also do a dummy push/pop of whatever register you like instead of manipulating %rsp directly; some compilers do use a dummy push to align the stack because this can actually be cheaper on modern CPUs, and it saves code size.

Necrose answered 19/4, 2013 at 4:40 Comment(4)

I think you're right - I've run into similar problems myself. The reason it works for an integer is just luck of the draw. Undefined behaviour and all that. OP's first example doesn't work on my machine without a stack adjustment, either. I just used sub $8, %rsp, though. – Martyrology 19/4, 2013 at 4:45

Sometimes the number of pushes varies, so the stack may or may not be aligned to 16 bytes. A logical AND on sp or spl always works, like this: and spl,0xf0. – Malvaceous 19/4, 2013 at 6:24

@Necrose Nice answer. I was hoping it was something like that. I got the push %rbp idea from writing the code in C and doing gcc -S. I know the 32-bit assembly push %ebp; mov %esp, %ebp stack frames very well, and figured gcc's pushing of %rbp was a relic of those old days and would have been shocked if it actually mattered. Thanks for the note about the alignment; I will go back and study the ABI docs much more carefully! – Divebomb 19/4, 2013 at 18:41

Is that exact gcc output for some platform? Normally you get .LC1 (with a leading dot, so it's a GAS local label), and Linux ELF systems don't use leading underscores. Is that from MacOS X, which uses the x86-64 SysV calling convention and _ on symbol names? It'd probably be better to match the OP's code for Ubuntu, so it doesn't look like _main instead of main is a correction / part of the answer. e.g. godbolt.org/g/2PKKAP has the output from gcc4.6.4 -O3, and uses .LC... and no _, but is otherwise identical to your answer except for instruction order. – Projectionist 18/4, 2018 at 22:7
