Trailing padding in C/C++ in nested structures - is it necessary?

Asked 22/12, 2022 at 0:44 Answered 22/12, 2022 at 17:22

Solved c++ c padding compiler-optimization structlayout

This is more of a theoretical question. I'm familiar with how padding and trailing padding work.

#include <stdint.h> // uint32_t and friends

struct myStruct{
    uint32_t x;
    char*    p;
    char     c;
};

// myStruct layout will compile to
// x:       4 Bytes
// padding: 4 Bytes
// p:       8 Bytes
// c:       1 Byte
// padding: 7 Bytes
// Total:   24 Bytes
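
The layout listed above can be checked rather than assumed. Here is a minimal sketch using offsetof and sizeof; the helper names trailing_padding and print_layout are made up for illustration, and the numbers from the listing are only expected on a typical 64-bit (LP64) target such as x86-64 Linux:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct myStruct {
    uint32_t x;
    char*    p;
    char     c;
};

/* These two properties hold on every conforming target, whatever the sizes are. */
static_assert(offsetof(struct myStruct, p) % _Alignof(char*) == 0,
              "p must sit at a multiple of its alignment");
static_assert(sizeof(struct myStruct) % _Alignof(struct myStruct) == 0,
              "size must be a multiple of the struct's alignment");

/* Bytes between the end of c and the end of the struct. */
size_t trailing_padding(void) {
    return sizeof(struct myStruct)
         - (offsetof(struct myStruct, c) + sizeof(char));
}

/* Hypothetical helper: prints the offsets the compiler actually chose. */
void print_layout(void) {
    printf("x@%zu p@%zu c@%zu size=%zu trailing=%zu\n",
           offsetof(struct myStruct, x),
           offsetof(struct myStruct, p),
           offsetof(struct myStruct, c),
           sizeof(struct myStruct),
           trailing_padding());
}
```

Calling print_layout() from main shows what your own compiler and target produce; on a 32-bit target the pointer is 4 bytes wide and the numbers come out smaller.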

There needs to be padding after x so that p is aligned, and there needs to be trailing padding after c so that the whole struct size is divisible by 8 (in order to get the right stride length). But consider this example:

struct A{
    uint64_t x;
    uint8_t  y;
};

struct B{
    struct A myStruct;
    uint32_t c;
};

// Based on all the information I have read on the internet, and on my tinkering
// with both GCC and Clang, the layout of struct B will look like:
// myStruct.x:       8 Bytes
// myStruct.y:       1 Byte
// myStruct.padding: 7 Bytes
// c:                4 Bytes
// padding:          4 Bytes
// total size:       24 Bytes
// total padding:    11 Bytes
// padding overhead: 46%

// My question is: why does struct A not get "inlined" into struct B,
// so that the final layout of struct B looks like this:
// myStruct.x:       8 Bytes
// myStruct.y:       1 Byte
// padding           3 Bytes
// c:                4 Bytes
// total size:       16 Bytes
// total padding:    3 Bytes
// padding overhead: 19%

Both layouts satisfy the alignment of every variable. Both layouts keep the variables in the same order. In both layouts struct B has a correct stride length (divisible by 8 Bytes). The only thing that differs (besides the 33% smaller size) is that struct A does not have a correct stride length in layout 2, but that should not matter, since clearly there is no array of struct A.

I checked this layout in GCC with -O3 and -g, struct B has 24 Bytes.
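
For anyone who wants to reproduce the 24-Byte result, here is a minimal sketch that derives the padding numbers from offsetof and sizeof instead of counting by hand. The helper names total_padding_B and tail_padding_A are made up for illustration, and the concrete values (16, 24, 11, 7) assume a target where uint64_t is 8-byte aligned, such as x86-64:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct A {
    uint64_t x;
    uint8_t  y;
};

struct B {
    struct A myStruct;
    uint32_t c;
};

/* The nested member occupies all sizeof(struct A) bytes, tail padding included,
   so c can never start before that point. */
static_assert(offsetof(struct B, c) >= sizeof(struct A),
              "c must not overlap the storage of myStruct");

/* Padding bytes in struct B: total size minus the bytes the members use. */
size_t total_padding_B(void) {
    return sizeof(struct B)
         - (sizeof(uint64_t) + sizeof(uint8_t) + sizeof(uint32_t));
}

/* Padding that lives inside the nested struct A, after y. */
size_t tail_padding_A(void) {
    return sizeof(struct A) - (offsetof(struct A, y) + sizeof(uint8_t));
}
```

The static_assert is the interesting part: c starts no earlier than sizeof(struct A), so the tail padding of A is never handed over to the enclosing struct.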

My question is - is there some reason why this optimization is not applied? Is there some layout requirement in C/C++ that forbids this? Or is there some compilation flag I'm missing? Or is this an ABI thing?

EDIT: Answered.

See the answer from @dbush on why the compiler cannot emit this layout on its own.
The following code example uses the GCC attributes packed and aligned (as suggested by @jaskij) to manually enforce the more compact layout. Struct B_packed takes only 16 Bytes instead of 24 Bytes (note that this code might cause issues or run slowly when there is an array of B_packed structs; be aware of that and don't blindly copy this code):

struct __attribute__ ((__packed__)) A_packed{
    uint64_t x;
    uint8_t  y;
};

struct __attribute__ ((__packed__)) B_packed{
    struct A_packed myStruct;
    uint32_t c __attribute__ ((aligned(4)));
};

// Layout of B_packed will be
// myStruct.x:       8 Bytes
// myStruct.y:       1 Byte
// padding for c:    3 Bytes
// c:                4 Bytes
// total size:       16 Bytes
// total padding:    3 Bytes
// padding overhead: 19%
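
The numbers in this listing can be pinned down at compile time. A minimal sketch, assuming GCC or Clang (the attribute syntax is compiler-specific); read_x is a hypothetical accessor added only to show a safe way of reading a member:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang attribute syntax; other compilers need a different spelling. */
struct __attribute__((__packed__)) A_packed {
    uint64_t x;
    uint8_t  y;
};

struct __attribute__((__packed__)) B_packed {
    struct A_packed myStruct;
    uint32_t c __attribute__((aligned(4)));
};

/* Compilation fails if the compiler picks a different layout. */
static_assert(sizeof(struct A_packed) == 9, "packed A has no tail padding");
static_assert(offsetof(struct B_packed, c) == 12, "c is rounded up to a multiple of 4");
static_assert(sizeof(struct B_packed) == 16, "8 + 1 + 3 + 4");

/* Read members by value through the struct: taking the address of a member
   of a packed struct can yield a misaligned pointer. */
uint64_t read_x(const struct B_packed *b) {
    return b->myStruct.x;
}
```

The price is that x is no longer guaranteed 8-byte alignment, so the compiler has to treat accesses to it as potentially unaligned.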

Dagnah asked 22/12, 2022 at 0:44 Comment(5)

Point taken. I have edited the question with Bytes instead of suffix B. Dropping the suffix completely would be much more confusing in my opinion and not correct. – Dagnah 22/12, 2022 at 1:5

Well, that's a matter of opinion I suppose. sizeof always returns a value in bytes, so whenever we are talking about sizes of data types, padding etc, we are always talking about bytes. – Trodden 22/12, 2022 at 1:6

It's probably a remnant of studying electrical engineering in uni; seeing a number without a unit next to it looks like a sin to me – Dagnah 22/12, 2022 at 1:32

What happens when you make an array of your type? Consider a type that has a 4-byte int followed by a 1 byte char, what happens if you put that type into an array? – Hux 22/12, 2022 at 2:32

Note that all your size assumptions are for a 64 bit architecture (perhaps x86_64 specifically). 32 bit isn't dead yet, and will be around for quite a few years. Hell, 16 bit stuff is used in places. That said, maybe it's a rule here to just assume a modern laptop/desktop/server processor? What you actually want to look at is the (somewhat cursed) packed attribute. I say cursed, because it leads to unaligned access. Although you can couple packed and aligned to specify any and all alignment you wish. – Lumper 22/12, 2022 at 21:8

is there some reason why this optimization is not applied

If this were allowed, the value of sizeof(struct B) would be ambiguous.

Suppose you did this:

struct B b;
struct A a = { 1, 2 };
b.c = 0x12345678;
memcpy(&b.myStruct, &a, sizeof(struct A));

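With the proposed layout, b.c would sit at offset 12, inside the 16 bytes that sizeof(struct A) still reports, so you'd be overwriting the value of b.c.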
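
To see why the real layout keeps this copy safe, here is a minimal sketch that runs it and checks b.c afterwards. The function names c_after_copy and copy_stays_clear_of_c are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct A { uint64_t x; uint8_t y; };
struct B { struct A myStruct; uint32_t c; };

/* Repeats the copy from above and returns b.c afterwards. */
uint32_t c_after_copy(void) {
    struct B b;
    struct A a = { 1, 2 };
    b.c = 0x12345678;
    /* Writes bytes [0, sizeof(struct A)) of b, tail padding included. */
    memcpy(&b.myStruct, &a, sizeof(struct A));
    return b.c;
}

/* 1 if the memcpy range ends at or before the first byte of c. */
int copy_stays_clear_of_c(void) {
    return sizeof(struct A) <= offsetof(struct B, c);
}
```

With the layout compilers actually use, c_after_copy() returns 0x12345678, because c starts only after all sizeof(struct A) bytes of myStruct.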

Frostwork answered 22/12, 2022 at 1:3 Comment(5)

Right, I didn't think about manual memory handling. But theoretically, if the compiler could prove that this won't happen, this optimization would be safe, right? – Dagnah 22/12, 2022 at 1:16

The compiler cannot prove this won't happen. Read about the Halting problem. – Trodden 22/12, 2022 at 1:50

@Dagnah if struct a is defined by itself in one file, there's no way for the compiler to know that struct b was defined in a different unrelated file. – Frostwork 22/12, 2022 at 2:10

I'd actually recommend reading about Rice's theorem directly instead of the Halting problem. The compiler may sometimes decide to lay out the struct differently or not put it into the memory at all, e.g. if there is no manual memory management and it's very short-lived and there is no struct at all. However, one won't be able to detect that from inside the program in any way due to the "as-if" rule. – Holocaust 22/12, 2022 at 3:30

@yeputons: The optimization you're talking about is a case of scalar replacement of aggregates, SROA or SRA for short. It's not really changing the struct layout, it's optimizing away the struct entirely and just working with the members. GCC can do this even between functions, as described in the 2010 paper The new intraprocedural Scalar Replacement of Aggregates, like the -fipa-sra optimization option. (Just a random google hit for "scalar replacement of aggregates"). Java and Javascript JITs do this, too. – Technical 22/12, 2022 at 4:17

Padding is used to force alignment. Now if you have an array of struct myStruct, then there is a rule that array elements follow each other without any padding. In your case, without padding inside myStruct after the last field, the second myStruct in an array wouldn't be properly aligned. Therefore it is necessary that sizeof(myStruct) is a multiple of the alignment of myStruct, and for that you may need enough padding at the end.
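
The array rule can be observed directly. A minimal sketch, reusing struct A from the question; the helper names stride and elements_aligned are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct A { uint64_t x; uint8_t y; };

/* Distance in bytes between consecutive array elements. */
size_t stride(void) {
    struct A arr[2];
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}

/* 1 if every element of a small array starts at a multiple of the alignment. */
int elements_aligned(void) {
    struct A arr[4];
    for (size_t i = 0; i < 4; ++i) {
        if ((uintptr_t)&arr[i] % _Alignof(struct A) != 0) return 0;
    }
    return 1;
}
```

The stride always equals sizeof(struct A), so the elements stay aligned only because that size is a multiple of the alignment.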

Divisive answered 22/12, 2022 at 17:22 Comment(1)

Yes, but as you can see, struct B does not contain any array of struct A (myStruct), so this is irrelevant. I'm talking about the special case where you have a struct inside a struct, it is not an array, and therefore you don't need the trailing padding to force proper alignment of the next struct. – Dagnah 24/12, 2022 at 23:49
