This is more of a theoretical question. I'm familiar with how padding and trailing padding work.
#include <stdint.h>

struct myStruct {
    uint32_t x;
    char *p;
    char c;
};
// On a typical 64-bit target, myStruct is laid out as:
// x: 4 Bytes
// padding: 4 Bytes
// p: 8 Bytes
// c: 1 Byte
// padding: 7 Bytes
// Total: 24 Bytes
There needs to be padding after x so that p is aligned, and there needs to be trailing padding after c so that the whole struct size is divisible by 8 (in order to get the right stride length).
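The layout above can be checked rather than assumed. Here is a minimal sketch, assuming a typical 64-bit target with 8-byte pointers (the static assertions are my assumption about such a target and will deliberately fail to compile elsewhere); print_myStruct_layout is a hypothetical helper name, not something from the original code:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct myStruct {
    uint32_t x;
    char *p;
    char c;
};

/* Assumption: 8-byte, 8-aligned pointers (typical 64-bit ABI). */
_Static_assert(offsetof(struct myStruct, x) == 0, "x starts the struct");
_Static_assert(offsetof(struct myStruct, p) == 8, "4 bytes of padding after x");
_Static_assert(offsetof(struct myStruct, c) == 16, "c follows p directly");
_Static_assert(sizeof(struct myStruct) == 24, "7 bytes of trailing padding");

/* Hypothetical helper: prints the offsets and size the compiler chose. */
void print_myStruct_layout(void)
{
    printf("offsetof x = %zu\n", offsetof(struct myStruct, x));
    printf("offsetof p = %zu\n", offsetof(struct myStruct, p));
    printf("offsetof c = %zu\n", offsetof(struct myStruct, c));
    printf("sizeof     = %zu\n", sizeof(struct myStruct));
}
```

Calling print_myStruct_layout() from main prints whatever the target ABI actually uses, which is handy when the assumption about pointer size does not hold.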
But consider this example:
struct A {
    uint64_t x;
    uint8_t y;
};
struct B {
    struct A myStruct;
    uint32_t c;
};
// Based on all the information I read on the internet, and based on my tinkering
// with both GCC and Clang, the layout of struct B will look like:
// myStruct.x: 8 Bytes
// myStruct.y: 1 Byte
// myStruct.padding: 7 Bytes
// c: 4 Bytes
// padding: 4 Bytes
// total size: 24 Bytes
// total padding: 11 Bytes
// padding overhead: 46%
// My question is: why does struct A not get "inlined" into struct B,
// so that the final layout of struct B looks like this:
// myStruct.x: 8 Bytes
// myStruct.y: 1 Byte
// padding: 3 Bytes
// c: 4 Bytes
// total size: 16 Bytes
// total padding: 3 Bytes
// padding overhead: 19%
Both layouts satisfy the alignment of every member. Both layouts keep the members in the same order. In both layouts struct B has the correct stride length (divisible by 8 Bytes). The only thing that differs (besides the 33% smaller size) is that struct A does not have the correct stride length in layout 2, but that should not matter, since clearly there is no array of struct A.
I checked this layout in GCC with -O3 and -g: struct B has 24 Bytes.
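For anyone who wants to reproduce the measurement, here is a rough sketch. It assumes a typical 64-bit target where uint64_t is 8-byte aligned; print_B_layout is a hypothetical helper name I made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct A {
    uint64_t x;
    uint8_t y;
};

struct B {
    struct A myStruct;
    uint32_t c;
};

/* Hypothetical helper: prints the sizes and offsets the compiler chose. */
void print_B_layout(void)
{
    printf("sizeof(struct A)      = %zu\n", sizeof(struct A));
    printf("_Alignof(struct A)    = %zu\n", (size_t)_Alignof(struct A));
    printf("offsetof(struct B, c) = %zu\n", offsetof(struct B, c));
    printf("sizeof(struct B)      = %zu\n", sizeof(struct B));
}
```

On x86-64 with GCC or Clang I would expect 16, 8, 16 and 24, matching the first layout above: c starts only after the full sizeof(struct A) bytes of myStruct. These values come from the platform ABI, so the optimization level should not change them.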
My question is - is there some reason why this optimization is not applied? Is there some layout requirement in C/C++ that forbids this? Or is there some compilation flag I'm missing? Or is this an ABI thing?
EDIT: Answered.
- See the answer from @dbush on why the compiler cannot emit this layout on its own.
- The following code example uses the GCC attributes packed and aligned (as suggested by @jaskij) to manually enforce the more optimized layout. Struct B_packed has only 16 Bytes instead of 24 Bytes (note that this code might cause issues or run slowly when there is an array of B_packed structs; be aware of this and don't blindly copy the code):
struct __attribute__ ((__packed__)) A_packed {
    uint64_t x;
    uint8_t y;
};
struct __attribute__ ((__packed__)) B_packed {
    struct A_packed myStruct;
    uint32_t c __attribute__ ((aligned(4)));
};
// Layout of B_packed will be
// myStruct.x: 8 Bytes
// myStruct.y: 1 Byte
// padding for c: 3 Bytes
// c: 4 Bytes
// total size: 16 Bytes
// total padding: 3 Bytes
// padding overhead: 19%
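To illustrate the constraint with my own sketch (not code from the answers): every struct A object, including one nested inside struct B, occupies sizeof(struct A) bytes, and code holding a struct A * is allowed to write all of them. The helpers clear_A and c_after_clear are hypothetical names, and the sizes assume GCC or Clang on a typical 64-bit target:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct A { uint64_t x; uint8_t y; };
struct B { struct A myStruct; uint32_t c; };

struct __attribute__((__packed__)) A_packed { uint64_t x; uint8_t y; };
struct __attribute__((__packed__)) B_packed {
    struct A_packed myStruct;
    uint32_t c __attribute__((aligned(4)));
};

/* Hypothetical helper: legal for ANY struct A object. It writes all
 * sizeof(struct A) bytes, trailing padding included. */
void clear_A(struct A *a)
{
    memset(a, 0, sizeof *a);
}

/* Clears the nested member and returns c afterwards. With the real
 * layout c lives at offset 16, outside A's 16-byte footprint, so it
 * survives. If c sat at offset 12, inside A's trailing padding, the
 * memset above would have zeroed it. */
uint32_t c_after_clear(void)
{
    struct B b = { .myStruct = { 1, 2 }, .c = 0xC0FFEE };
    clear_A(&b.myStruct);
    return b.c;
}

/* The packed variant avoids the conflict differently: A_packed has no
 * trailing padding at all, so c at offset 12 is outside its footprint. */
_Static_assert(sizeof(struct A_packed) == 9, "no trailing padding");
_Static_assert(offsetof(struct B_packed, c) == 12, "c aligned to 4");
_Static_assert(sizeof(struct B_packed) == 16, "packed B is 16 bytes");
```

With the ordinary layout, c_after_clear() returns 0xC0FFEE. The compiler cannot know that no such helper will ever be handed a pointer to b.myStruct, which is one way to see why it does not reuse the trailing padding by itself.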
[...] Bytes instead of suffix B. Dropping the suffix completely would be much more confusing in my opinion and not correct. – Dagnah
sizeof always returns a value in bytes, so whenever we are talking about sizes of data types, padding etc, we are always talking about bytes. – Trodden
[...] int followed by a 1 byte char, what happens if you put that type into an array? – Hux
[...] packed attribute. I say cursed, because it leads to unaligned access. Although you can couple packed and aligned to specify any and all alignment you wish. – Lumper