What defines the size of a type?

The ISO C standard says that:

sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

I am using GCC 8 on 64-bit Linux Mint (19.1), and the size of long int is 8 bytes.

I am also using an app which uses GCC 7, and that compiler is 64-bit, yet the size of long int there is 4 bytes. Does the compiler or the operating system define the size of a long int?
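
For reference, here is a minimal sketch that prints what a given toolchain chose; the output depends entirely on the compiler and target, not on the standard:

    #include <stdio.h>

    int main(void)
    {
        /* Report what this compiler/target chose for each basic type. */
        printf("char:      %zu\n", sizeof(char));
        printf("short:     %zu\n", sizeof(short));
        printf("int:       %zu\n", sizeof(int));
        printf("long:      %zu\n", sizeof(long));
        printf("long long: %zu\n", sizeof(long long));
        printf("void *:    %zu\n", sizeof(void *));
        return 0;
    }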

Terle asked 15/5, 2019 at 19:37 Comment(4)
This is probably helpful, but it's not an answer itself: en.wikipedia.org/wiki/64-bit_computing – Dynamics
To the best of my knowledge, all versions of GCC use 8-byte long ints when compiling Linux x86_64 binaries. On the other hand, they all use 4-byte long ints when compiling Linux x86 binaries. It's not about the form or host of the compiler, it's about the target. – Knickerbocker
I'm guessing that Windows is probably used with GCC 7. Windows has the LLP64 data model, which defines long as 32-bit (4 bytes). – Analogy
The C standard also requires sizeof(char) == 1, sizeof(short)*CHAR_BIT >= 16, sizeof(int)*CHAR_BIT >= 16, sizeof(long)*CHAR_BIT >= 32. – Amorita

The compiler calls all the shots. The operating system just runs the resulting binary.

That being said, the compiler will normally make an executable the operating system can use, so there's some interplay here. Since things like the size of int don't really matter so long as they're consistent, you will see variation.

In other words, if the kernel expects long int to be 8 bytes because of how it was compiled, then you'll want to compile your code the same way, or it won't interoperate with the kernel and none of the shared libraries will work.
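
If code really does depend on long being a particular width, a static assertion catches a mismatch at build time rather than at run time. A minimal sketch, assuming a C11 compiler; the 8-byte expectation below is only an example:

    #include <assert.h>   /* static_assert macro (C11) */

    /* Compilation fails on any target where long is not 8 bytes wide. */
    static_assert(sizeof(long) == 8, "this code assumes an 8-byte long");

    int main(void) { return 0; }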

Medarda answered 15/5, 2019 at 19:40 Comment(14)
This is a good answer, but I would like to know more about why the same compiler uses different sizes on the same architecture. If the operating system is 64-bit, I cannot understand the difference. – Terle
Changing trends. People used to be a lot more sensitive about making long int 64 bits back when computers had less memory, but now it's no big deal. 64-bit CPUs have been around since the days when 128 MB was a ton of memory. Over time a lot of things have gotten bigger because there's less pressure to shave bytes. – Medarda
The compiler calls all the shots. Not really. Data sizes are specified by the ABI. – Dynamics
It's worth noting that more modern C code often uses explicitly sized types to avoid ambiguity, such as those introduced in C99 like int64_t. – Medarda
@Medarda It is not quite the truth, of course. If you compile to run on a 64-bit system, long will be 8 bytes; if you compile to run on a 32-bit system, it will be 4. This (and much more) is called the ABI, which lets the compiler know how to generate the code. – Algor
@AndrewHenle True, but that's often dictated by what the compiler does, since in the case of Linux it's an ouroboros: the compiler compiling the ABI, and the ABI constraining the compiler. – Medarda
@P__J__ The ABI is not constraining the compiler. Obeying the ABI makes the program runnable on the particular system. – Algor
@P__J__ I'm speaking here to how Linux, as one example, has been hugely influenced in its design by GCC and its capabilities. – Medarda
@Medarda Pedantically, a compiler produces binaries compatible with the target ABI. But since ABIs and compilers tend to come from the same people at the same time (one without the other isn't very useful unless you like assembler), it's de facto more of a chicken-and-egg problem to say which comes first and which one "rules". – Dynamics
@AndrewHenle Said it much better than I did. Thanks. – Medarda
@P__J__ See also Microsoft Windows and Microsoft's C++ compiler, SunOS and Sun's C compiler, IRIX and SGI's C compiler, HP-UX and HP's C compiler, and I could list like a hundred more. There's a symbiotic relationship here, as Andrew did a good job of explaining. – Medarda
@Michi: Re “I would like to know more about why the same compiler uses different sizes on the same architecture. If the operating system is 64-bit I cannot understand the difference.”: The operating system can only control how software interfaces with it. If there is an operating system routine that needs to be passed a 32-bit integer, then you have to pass it a 32-bit integer. But, inside the program, the program (and the compiler) can do anything they want. If a compiler wants to call a 32-bit integer an int and call a 48-bit integer a gromitz, then it can. – Lemon
@Michi: Think about writing software. You can use a lot of different code to implement effectively the same loop. Similarly, the compiler can generate whatever instructions it needs to make integers or other types behave the way it wants. It is a computer—a general purpose computer—so it can be programmed in practically infinitely many different ways. When writing the compiler, we can make the char, short, int, and long types in a program behave any way we want, just by writing the right code. – Lemon
The driver for different implementations for the same target to name their types the same way is not the ABI, as @EricPostpischil has said several times now. It is the desire for interoperability and binary compatibility. Implementations must agree about how they name their types, else it is difficult for code (e.g. libraries) built by or intended for use with one to be used with another. That is, however, a quality-of-implementation issue, not any kind of requirement. – Knickerbocker

The Application Binary Interface for an operating system/architecture specifies the sizes of basic types:

ABIs cover details such as the following (the second item is the one most relevant here; see the sketch after this list):

  • a processor instruction set (with details like register file structure, stack organization, memory access types, ...)
  • the sizes, layouts, and alignments of basic data types that the processor can directly access
  • the calling convention, which controls how functions' arguments are passed and return values are retrieved; for example, whether all parameters are passed on the stack or some are passed in registers, which registers are used for which function parameters, and whether the first function parameter passed on the stack is pushed first or last onto the stack
  • how an application should make system calls to the operating system and, if the ABI specifies direct system calls rather than procedure calls to system call stubs, the system call numbers
  • and in the case of a complete operating system ABI, the binary format of object files, program libraries and so on.
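
For example, on an x86-64 Linux machine GCC can target either the 64-bit (LP64) ABI or the 32-bit (ILP32) ABI from the same source, and the size of long follows the ABI, not the machine. A sketch, assuming a GCC or Clang toolchain with 32-bit support installed:

    /* abi_demo.c
     * gcc -m64 abi_demo.c && ./a.out   =>  long: 8  void*: 8   (LP64)
     * gcc -m32 abi_demo.c && ./a.out   =>  long: 4  void*: 4   (ILP32)
     */
    #include <stdio.h>

    int main(void)
    {
        printf("long: %zu  void*: %zu\n", sizeof(long), sizeof(void *));
        return 0;
    }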
Dynamics answered 15/5, 2019 at 19:38 Comment(6)
The compiler decides the mapping from its type names to the types in the ABI. – Lemon
@EricPostpischil Feel free to edit. I suspect you know a lot more about this than I do. ;-) – Dynamics
@EricPostpischil I just changed my answer to a community wiki. It's all yours to improve. – Dynamics
Medarda's answer is correct. – Lemon
@EricPostpischil But that just means the compiler is using types provided by the ABI. The choices a compiler has are constrained, so the ABI can't be ignored. – Dynamics
The ABI (which, by the way, is voluntary) can only specify how to pass integers of 8, 16, 32, or other numbers of bits, how to pass floating-point objects of various sizes, how to pass structures, and so on. It cannot control what they are called in the programming language. If long is a 32-bit integer in the programming language, it will be passed as a 32-bit integer in the ABI. If long is a 64-bit integer in the programming language, it will be passed as a 64-bit integer in the ABI. The ABI never sees the name. The compiler decides what the name means. – Lemon

This is left to the discretion of the implementation.

It's the implementation (compiler and standard library) that defines the size of long, int, and all other types.

As long as they fit the constraints given by the standard, the implementation can make all the decisions as to what sizes the types are (possibly with the exception of pointers).
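
One way to see the decisions a particular implementation made is to look at the ranges it advertises in limits.h. A minimal sketch (the macro names are standard; the values printed are whatever this implementation chose):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* These macros report the ranges this implementation chose. */
        printf("CHAR_BIT  = %d\n", CHAR_BIT);
        printf("SHRT_MAX  = %d\n", SHRT_MAX);
        printf("INT_MAX   = %d\n", INT_MAX);
        printf("LONG_MAX  = %ld\n", LONG_MAX);
        printf("LLONG_MAX = %lld\n", LLONG_MAX);
        return 0;
    }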

Analogy answered 15/5, 2019 at 20:15 Comment(0)

TL/DR - the exact size is up to the compiler.


The Standard requires that a type be able to represent a minimum range of values - for example, an unsigned char must be able to represent at least the range [0..255], an int must be able to represent at least the range [-32767...32767], etc.

That minimum range defines a minimum number of bits - you need at least 16 bits to represent the range [-32767..32767] (some systems may use padding bits or parity bits that are part of the word, but not used to represent the value).

Other architectural considerations come into play - int is usually set to be the same size as the native word size. So on a 16-bit system, int would (usually) be 16 bits, while on a 32-bit system it would be 32 bits. So, ultimately, it comes down to the compiler.

However, it's possible to have one compiler on a 32-bit system use a 16-bit int, while another uses a 32-bit int. That led to a wasted afternoon back in the mid-90s where I had written some code that assumed a 32-bit int that worked fine under one compiler but broke the world under a different compiler on the same hardware.

So, lesson learned - never assume that a type can represent values outside of the minimum guaranteed by the Standard. Either check against the contents of limits.h and float.h to see if the type is big enough, or use one of the sized types from stdint.h (int32_t, uint8_t, etc.).
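
A sketch of that last point, assuming a C99-or-later toolchain: the exact-width types in stdint.h and the matching format macros in inttypes.h remove the guesswork about how wide int or long happens to be on a given target:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int32_t counter = 2000000000;            /* exactly 32 bits wherever it exists */
        int64_t big     = INT64_C(9000000000);   /* would overflow a 32-bit long */

        /* PRId32/PRId64 expand to the right printf conversion for this target. */
        printf("%" PRId32 " and %" PRId64 "\n", counter, big);
        return 0;
    }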

Marxmarxian answered 15/5, 2019 at 21:28 Comment(1)
Note: the Standard also requires sizeof(char) == 1 in addition to various type range requirements. – Amorita
