What ABI, if any, restricts the size of [u]intmax_t?

Starting with the 1999 edition, the ISO C standard specifies a header <stdint.h> that defines, among other things, the typedefs intmax_t and uintmax_t. These designate, respectively, "a (signed|unsigned) integer type capable of representing any value of any (signed|unsigned) integer type".

For example, if, as is typical, the widest signed and unsigned integer types are long long int and unsigned long long int (usually 64 bits each), then intmax_t and uintmax_t might be defined in <stdint.h> as follows:

typedef long long int intmax_t;
typedef unsigned long long int uintmax_t;

There is a limited set of predefined signed and unsigned integer types, ranging from signed, unsigned, and plain char up to signed and unsigned long long int.

C99 and C11 also permit implementations to define extended integer types, which are distinct from any of the standard types and have names that are implementation-defined keywords.

Both gcc and clang, on some but not all targets, support types __int128 and unsigned __int128. These act like 128-bit integer types, but they are not treated as extended integer types, and the documentation for both compilers states that they do not support any extended integer types. Because these are not integer types as the Standard defines the term, the typedefs intmax_t and uintmax_t are for 64-bit types, not 128-bit types.
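
A quick way to see this on a given compiler is to compare the sizes directly. This is a minimal sketch; the comments describe what typical x86_64 gcc/clang builds print, which may differ on other targets:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* On a typical x86_64 gcc or clang build, intmax_t is long long,
       so the first two lines print 8 even though __int128 is 16 bytes. */
    printf("sizeof(intmax_t)  = %zu\n", sizeof(intmax_t));
    printf("sizeof(long long) = %zu\n", sizeof(long long));
#if defined(__SIZEOF_INT128__)
    printf("sizeof(__int128)  = %zu\n", sizeof(__int128));
#endif
    return 0;
}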

None of this violates the C standard (implementations are not required to have any extended integer types, and they're permitted to have arbitrary extensions as long as they don't break any strictly conforming programs). But it seems to me that it would make perfect sense for __int128 and unsigned __int128 to be treated as extended integer types, and for intmax_t and uintmax_t to be 128-bit types.

The rationale for not doing this is that changing the size of intmax_t and uintmax_t would be "an ABI-incompatible change".

The Clang C++ status page says, in footnote (5):

No compiler changes are required for an implementation such as Clang that does not provide any extended integer types. __int128 is not treated as an extended integer type, because changing intmax_t would be an ABI-incompatible change.

(Yes, this primarily discusses C++, but the rules are the same as for C.)

In a gcc bug report, the claim is made that:

sizeof(intmax_t) is fixed by various LP64 ABIs and cannot be changed

In both cases, no reference is given for this claim.

An x86_64 ABI document titled "System V Application Binary Interface, AMD64 Architecture Processor Supplement, Draft Version 0.99.6" does not mention intmax_t or uintmax_t, or even the <stdint.h> header. It does specify sizes and alignments for the predefined integer types (in Figure 3.1).

Finally, my question: Is the claim that the sizes of intmax_t and uintmax_t are restricted by an ABI valid? If so, what ABI imposes such a requirement? (And, incidentally, why?)

(In my opinion, such a requirement, if it exists, is unwise. It defeats the purpose of the C standard's permission to define extended integer types, and the intended meaning of intmax_t and uintmax_t. It makes it much more difficult to use 128-bit integer types effectively on systems that support them, while falling back to narrower types on other systems.)
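
For example, portable code that wants "the widest integer available" currently has to select the type itself rather than rely on intmax_t. A rough sketch of that fallback pattern (wide_int and uwide_int are invented names for illustration):

/* If __int128 were an extended integer type and intmax_t were 128 bits,
   plain intmax_t would do this job by itself. */
#include <stdint.h>

#if defined(__SIZEOF_INT128__)
typedef __int128          wide_int;    /* widest type the compiler offers */
typedef unsigned __int128 uwide_int;
#else
typedef intmax_t          wide_int;    /* fall back to the widest standard type */
typedef uintmax_t         uwide_int;
#endif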

Update: In N2303, titled "intmax_t, a way out", Jens Gustedt proposes tweaking the definitions of [u]intmax_t to permit adding extended integer types wider than long long without having to update [u]intmax_t. For example, intmax_t might be a typedef for long long, but the implementation could still provide, say, __int128 as an extended integer type.

Counterproductive answered 28/4, 2015 at 18:51 Comment(1)
There's an argument to be made that 64 bits is enough for most uses. Having intmax_t be an extended-precision type that requires carry/borrow for add/sub, and even more complexity for mul/div, would lead to more bloated code than necessary. Basically I'm saying that some programmers may assume intmax_t to be no wider than the target can efficiently support. This is not a good argument, since intmax_t is 64-bit on most 32-bit machines. It would be nice if there were an int128_t, but intmax_t can't change on existing platforms without breaking backwards compat. – Mismanage

As Colonel Thirty Two notes, a compiler unilaterally making this change would break calls between compilation units that pass uintmax_t parameters or return uintmax_t values. Even though the SysV ABI doesn't define how these types are passed, as a matter of practicality, maintaining their definitions is part of conforming to the platform ABI.

Even if it weren't for this ABI issue, a compiler still couldn't unilaterally make this change, because it would require matching changes to every targeted platform's C standard library. Specifically, it would at least require updates to the printf and scanf families of functions, imaxabs, imaxdiv, strtoimax, strtoumax, and their variants.
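
A minimal sketch of the kind of calls involved; each one crosses the boundary between compiled code and the C library with intmax_t values, so both sides must agree on the type's width:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    /* strtoimax, imaxdiv, and printf with PRIdMAX all pass or return
       intmax_t across the library boundary. */
    intmax_t v = strtoimax("123456789012345", NULL, 10);
    imaxdiv_t qr = imaxdiv(v, 1000);
    printf("quot = %" PRIdMAX ", rem = %" PRIdMAX "\n", qr.quot, qr.rem);
    return 0;
}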

Baggywrinkle answered 28/4, 2015 at 19:3 Comment(2)
You forgot to mention the preprocessor, which is also supposed to work with [u]intmax_t. – Shiksa
@JensGustedt: I definitely didn't intend my list of affected components to be exhaustive =) – Baggywrinkle

Changing types like intmax_t and uintmax_t also changes the ABI of all the programs that use them, as they now refer to different types.

Say you have program A that uses a function in shared library B with a uintmax_t parameter. If GCC changes the definition of uintmax_t and A (but not B) gets recompiled, then uintmax_t in A and uintmax_t in B now refer to two different types, breaking the ABI.
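
A hypothetical sketch of that failure mode (the header b.h, the function b_count, and the register details are invented for illustration, not taken from any real library):

/* b.h -- hypothetical header shared by program A and shared library B */
#include <stdint.h>

uintmax_t b_count(uintmax_t start, uintmax_t step);

/* Library B was compiled when uintmax_t was a 64-bit type, so b_count
   expects each argument in a single 64-bit register and returns one.
   If program A is rebuilt with a compiler where uintmax_t is 128 bits,
   A passes 128-bit values (e.g. in register pairs) and reads back a
   128-bit result. The prototype still matches textually, but the
   calling convention no longer does, so the call silently misbehaves. */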

Sigmund answered 28/4, 2015 at 19:0 Comment(5)
It's too bad C never defined a standard way via which prototypes could specify types in absolute terms. How many zillions of dollars would have been saved if the prototype for printf could have specified "convert all integer values to N bits, all floating-point values to M bits, and all pointers to void*"? On a system where the largest integer printf supported was 64 bits, code wanting to print the result of adding 123456 to an int wouldn't have to worry about whether the result would be an int or long, or the sizes of those types, since both would be passed as 64 bits. – Fluffy
If the printf function is limited to 64 bits, then even if intmax_t happens to be bigger, the best that could happen when trying to print an intmax_t would be for it to get converted to a 64-bit value and given to printf; that would allow values that fit in 64 bits to print correctly, regardless of the type of intmax_t. – Fluffy
@Fluffy You do realize that C specifies fixed-size types in stdint.h, and that vararg functions take more than just integers, so that truncating only integers would add a weird corner case? – Sigmund
The stdint.h types interact in implementation-dependent ways with other types, both with regard to promotion and the strict aliasing rules. I mentioned printf because that's where problems are most obvious. On most systems where int and long are both 32 bits, printing an int32_t may require either %d or %ld; one will be right and the other will invoke Undefined Behavior. Some such systems require one, and some require the other, but I wouldn't be surprised if at least 10% of extant code for such systems uses the wrong one and only works because implementations are "nice". – Fluffy
I used integers for my example because intmax_t was being discussed; the problem is much more severe with floating-point types. I suspect that 90% of code which uses printf with type long double invokes Undefined Behavior, and the fact that such code only works on systems which pass long double and double identically has led many compilers to define long double as 64 bits even when they use 80-bit types for calculations [using extended precision for calculations is a good thing from both a semantic and efficiency standpoint if and only if it's available as a data type]. – Fluffy

I think the key thing to understand here is that just because something is not documented in an ABI specification does not mean it is not part of an ABI. As soon as a type is used across a library boundary, its properties become part of the ABI of that library.

By defining (u)intmax_t in a standard header and using them in functions of the standard library, they become part of the ABI of that library, whether or not they are included in any formal ABI specification.

This is especially an issue for Unix-like platforms where the C standard library is treated as part of the platform, not part of the compiler.

Now, it would be possible to transition away from this. The printf type specifiers for these types are provided as macros (PRIdMAX and friends), so those macros could be defined differently depending on the size of intmax_t, as sketched below. Macros could similarly map the handful of affected standard-library functions to different implementations, but it's a bunch of extra work for questionable gain, so it's hardly surprising that gcc took the path of least resistance when adding the functionality it needed.
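
A rough sketch of how such a transition could look inside an implementation's <inttypes.h>. The entire 128-bit branch is invented for illustration: __INTMAX_IS_INT128__, __imaxdiv128, and the "w128" length modifier are not real gcc or glibc names:

/* Hypothetical fragment of an implementation's <inttypes.h>. */
#ifdef __INTMAX_IS_INT128__
typedef __int128 intmax_t;
#  define PRIdMAX "w128d"                        /* wider printf length modifier */
#  define imaxdiv(n, d) __imaxdiv128((n), (d))   /* 128-bit library entry point */
#else
typedef long long intmax_t;
#  define PRIdMAX "lld"
#endif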

Quarry answered 25/6, 2018 at 16:4 Comment(0)
