What is the calling convention for floating-point values in C for x86_64 in System V?
Asked Answered
W

1

6

I'm currently doing a deep-dive into Assembly land, mainly from the perspective of x86_64, C, and System V AMD64, generally targeting Linux.

It's pretty straightforward that the calling convention for integer (and by implication, pointer) values by using the following registers in order:

  • RDI
  • RSI
  • RDX
  • RCX
  • R8
  • R9
  • XMM0–7

Longer argument counts are handled by pushing values onto the stack frame of the subroutine. I got these register names from the Wikipedia page on x86_64 calling conventions.

For larger values like structs and arrays, the convention also seems to be to push into the callee's stack frame.

However, what is the calling convention for floating-point arguments to functions? Are floating point registers used?

Another related question: what if I have mixed argument types?

void mixed(int a, float b, mystruct c) { /* ... */ }

If my function takes an arg list like this, how do I call such a function from Assembly? Which registers are used in interleaved arg lists like this?

Wolter answered 9/9, 2019 at 19:3 Comment(14)
raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/…Idel
The actual ABI documentation describes all of this in detail. TL;DR: floating point arguments are passed in the XMM registers. "Interleaving" is done by picking the next available register appropriate for the type. Also, you can make a test case and study compiler output. In your example a goes into rdi, b goes into xmm0 and c we can't tell :)Lagging
values like structs and arrays -- arrays are never passed to a function. They are always promoted to a pointer to the type. Unless ofcourse when they are inside a struct. Then it follows the struct passing ABI.Epode
Also larger arguments like structs are not pushed on the stack, they are generally copied somewhere on the caller's stack frame and a pointer is passed. Same for struct returns. Space is allocated for the return value on the caller's stack frame and an extra argument (pointer to the space allocated) is passed as (typically the first) argument.Epode
@AjayBrahmakshatriya : not quite true. The windows 64-bit calling convention passes structs by value on the stack, but the System V 64-bit ABI will attempt to place structs passed by value in registers governed by a recursive algorithm defined in the ABI. Some structs can be passed entirely in registers, although it is all or nothing. If a particular struct is found to not be passable in registers the entire thing is passed on the stack (never a combination of stack and registers)Quaver
@MichaelPetch right, I meant this for larger structs that do not fit in registers. I should have clarified it in the comment.Epode
I recently touched on some of the issues in this question in a recent Stackoverflow answer. It isn't a duplicate of this, but there is a discussion about your last question passing structs (by value) to a function: #57767193 . That answer also has links to the ABs and relevant sections that define the rules for the calling convention, and how to return data from a function.Quaver
@AjayBrahmakshatriya : I was unsure if you meant that or not. I just was letting you know had you meant they weren't ever passed in registers. No harm done.Quaver
what happened when you tried it? Please show your experiments and results.Spermatophore
@old_timer: The complete answer to this question cannot reasonably be determined by experiment.Meridith
@EricPostpischil 1) of course it can, think about it. 2) by starting with an experiment you can jump start your way through the documentation. Saves a great deal of time. 3) For specific function prototypes that you are interested in the experiment covers 100% of what you need to know, no reason to look at a spec.Spermatophore
@old_timer: 1) No, it cannot, think about it. How to pass arguments is a function of, theoretically, an infinite number of possibilities, and, practically, too many possibilities to reasonably experiment on. 2) and 3) are irrelevant as they do not answer the question asked.Meridith
While a "deep-dive" can be useful for educational purposes, assembly might not be your best choice for production code. If seeking performance, in addition to knowing the purpose of each of the assembler instructions, you'd also need to know their latency, and the effect each one has on the surrounding instructions, etc. C compilers compute these things for you to (usually) produce good output, using decades worth of accumulated tricks and experience. It's possible to beat the compiler under certain circumstances, but the time, effort and (esp) maintenance often make asm impractical IRL.Tartuffe
@David Wohlferd absolutely. I'm just seeking to be able to understand things at the lowest level above hardware. Compilers are almost ALWAYS better than us at making machine code.Wolter
W
6

The calling convention for parameter passing is specified in the System V Application Binary Interface for AMD64PDF documentation in section 3.2.3.

I'm not sure if the documentation can be legally quoted here, but I can at least paraphrase.

Classification Types

First, the documentation defines eight different classifications for parameter values:

  • INTEGER: integer types and pointers which use the general purpose registers
  • SSE: types that use vector registers.
  • SSEUP: similar to SSE but primarily used to store upper bytes of large (>=128-bit) values
  • X87: floating point types.
  • X87UP: the upper bytes of large floating point types.
  • COMPLEX_X87: registers for complex floating point types.
  • NO_CLASS: padding areas and for empty structures and unions, typically in memory on the stack.
  • MEMORY: types that are exclusively passed on the stack in main memory.

Classification Rules

It next defines how C types fit into these classifications:

  • _Bool, char, short, int, long, long long, and pointers are classified as INTEGER and will use those registers.
  • float, double, _Decimal32, _Decimal64, and __m64 are classified as SSE and will use those registers.
  • __float128, _Decimal128, and __m128 are split in half, storing the least significant bytes/bits in SSE and the most significant bytes/bits in SSEUP.
  • __m256 is split into four 64-bit (8 byte) values, with the least significant bytes being stored as SSE and the rest as SSEUP
  • __m512 is similarly split into 64-bit (8 byte) chunks, with the least significant bytes stored as SSE and everything else as SSEUP
  • long double values store their 64-bit mantissa as X87 and the 16-bit exponent is padded to 64-bits (8 bytes) and stored in X87UP.
  • __int128 is essentially stored as two long values in INTEGER with the first half being the low bits/bytes and the second half being the high bits/bytes. They can be understood as if they were defined as a struct:

    typedef struct {
      long low_bits, high_bits;
    } __int128;
    
  • complex double and complex float types are split in half, with the first half being the real component and the second half being the imaginary component, and are stored in SSE. They can be understood as if they were defined as a struct like so:

    typedef struct {
      double real, imaginary;
    } complex_double;
    
  • complex long double values are classified as COMPLEX_X87.
  • The logic for structs, unions, and arrays is fairly complicated, consult the documentation linked above for more information. In a nutshell, there is a recursive algorithm defined for how to pass aggregate types that decides how values are passed.

Argument Passing

Now that we have a classification system and a recursive algorithm for dealing with structs, unions, and arrays, we apply this system and algorithm to the parameters to a function, which consists of the following steps for each argument:

  • If it's a MEMORY object, write it to the stack.
  • If it's an INTEGER, use the next available register from %rdi, %rsi, %rdx, %rcx, %r8, and %r9.
  • If it's SSE, use the next available register in the range %xmm0 to %xmm7.
  • If it's SSEUP, use the next available 64-bit chunk of the last-used %xmm register for SSE types.
  • If it's X87, X87UP, or COMPLEX_X87, it's passed in memory.

Rinse and repeat for all argument values. If you run out of registers for a given type, write to the stack.


TL;DR There is a non-trivial, but fairly straightforward algorithm defined by the System V ABI for passing different types of data.

Wolter answered 9/9, 2019 at 22:40 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.