Is the size of a struct required to be an exact multiple of the alignment of that struct?
D

9

23

Once again, I'm questioning a longstanding belief.

Until today, I believed that the alignment of the following struct would normally be 4 and the size would normally be 5...

struct example
{
  int   m_Assume_32_Bits;
  char  m_Assume_8_Bit_Bytes;
};

Because of this assumption, I have data structure code that uses offsetof to determine the distance in bytes between two adjacent items in an array. Today, I spotted some old code that was using sizeof where it shouldn't, couldn't understand why I hadn't had bugs from it, coded up a unit test - and the test surprised me by passing.

A bit of investigation showed that the sizeof the type I used for the test (similar to the struct above) was an exact multiple of the alignment, i.e. 8 bytes. It had padding after the final member. Here is an example of why I never expected this...

struct example2
{
  example m_Example;
  char    m_Why_Cant_This_Be_At_Offset_6_Bytes;
};

A bit of Googling showed examples that make it clear that this padding after the final member is allowed - for example http://en.wikipedia.org/wiki/Data_structure_alignment#Data_structure_padding (the "or at the end of the structure" bit).

This is a bit embarrassing, as I recently posted this comment - Use of struct padding (my first comment to that answer).

What I can't seem to determine is whether this padding to an exact multiple of the alignment is guaranteed by the C++ standard, or whether it is just something that is permitted and that some (but maybe not all) compilers do.

So - is the size of a struct required to be an exact multiple of the alignment of that struct according to the C++ standard?

If the C standard makes different guarantees, I'm interested in that too, but the focus is on C++.

Dovecote answered 9/1, 2011 at 5:2 Comment(2)
Yes, it is required. You couldn't do dynamic allocation (using malloc) of arrays if it wasn't.Humming
@Ben - +1 your nice to-the-point answer below but, on your comment, if you could assume that the items in an array were padded the same as same-type fields in a struct (with no members of other types), you could calculate your per-item padded size for the array as the offsetof the second member in a struct with two members of that type (and nothing else). That's what I have been mistakenly doing.Dovecote
M
10

One definition of alignment size:

The alignment size of a struct is the offset from one element to the next element when you have an array of that struct.

By its nature, if you have an array of a struct with two elements, then both need to have aligned members, so that means that yes, the size has to be a multiple of the alignment. (I'm not sure if any standard explicitly enforces this, but because the size and alignment of a struct don't depend on whether the struct is alone or inside an array, the same rules apply to both, so it can't really be any other way.)

Miscreated answered 9/1, 2011 at 5:6 Comment(11)
Why can't the compiler add padding between the struct instances in an array? Is there some direct (not indirectly implied) guarantee, perhaps based on old idioms with casted-to-char* pointer arithmetic, that each struct instance will start sizeof(T) chars on from the previous one in the array?Dovecote
Because if it does extra padding for arrays but not normal instances, then you can't iterate through it with a pointer. (Not sure if that answered your question though?)Miscreated
@Lambert - why not? Iterating using a pointer is surely specified to be compatible with the array, and that doesn't need to mean (though it may) any particular relationship between the pointer increment distance in bytes and the sizeof an item in the array. The sizeof one item may simply be different to the padded array stride.Dovecote
I see what you mean, but then how useful would "sizeof" be? What could you possibly use it for? (Edit: I just noticed a typo in my post, that apparently no one noticed! That definition is the definition of "size", not a definition of alignment; I fixed it.)Miscreated
@Lambert - I'd say sizeof was still usable for allocating memory using malloc or whatever. That was what it mostly seemed to be used for in C, with pointer arithmetic being less common than simply needing a heap allocated instance. But in the C++ context, I take the point that sizeof is certainly less directly useful in my old belief system.Dovecote
@Steve314: But if you use them for malloc, you use them for calloc too, and wouldn't that mean that it has to give you the size that's useful for an array?Miscreated
@Lambert - the thing is that alignment became an issue relatively recently, or at least it seems that way to me (I have no idea what 60s mainframes did). So it would be reasonable to say "sizeof didn't allow for alignment - programmer habits need to change, but pointer arithmetic should be rare enough that we'll just make it the programmer's responsibility, though maybe an extra function might be worthwhile". And considering we still don't have an official alignof, the current lack of an array_stride_of or whatever says very little. BTW - I did +1 your answer, and you may well get the accept.Dovecote
Say we allow for some separate operator strideof which returns 8 for this struct, while the sizeof is 5. Arrays auto-pad with 8-5 = 3 bytes in between instances. Now, say we dynamically allocate a single instance. It must be aligned to a 4-byte boundary. Now, what about the 3 bytes immediately following the allocation? Will we ever be able to use them? Only for values that only require 1- or 2-byte alignment, and even then it will be tricky.Chalmers
@Karl - the whole point of padding is that it's not used. But there are cases where there can be a saving in memory by not padding structs to a multiple of the alignment. There's an abstraction cost for nesting structs within structs (compared with a single flat struct with the same atomic members) given the standard padding rules - my old-belief rules are never less memory efficient and never break alignment, but they can be more memory efficient than the standard ones. By not padding that 5-byte struct up to 8 bytes, sometimes you can put something in those 3 bytes in the next struct layer.Dovecote
@Steve314: On the flip side, suppose a struct contains an int32, a short16, and a char8, in that order, on a machine where the former has 32-bit alignment. If the structure is padded out to 8 bytes it may be copied using two 32-bit operations, rather than one 32-bit, one 16-bit, and one 8-bit operation. Such an optimization would not work, however, if the structure could be encapsulated in another structure that nestled a byte in the unused space. Personally, I think C should standardize a means of forcing structure alignment to be looser or stricter than standard, so that e.g...Aguilera
...a structure containing only 8-bit char values could nonetheless be aligned so as to allow struct assignments to use word-copy rather than byte-copy operations, or code which needs to use packed objects could let the compiler optimize operations with them as best it can.Aguilera
J
21

5.3.3/2

When applied to a class, the result [of sizeof] is the number of bytes in an object of that class, including any padding required for placing objects of that type in an array.

So yes, object size is a multiple of its alignment.

Jaimie answered 9/1, 2011 at 8:16 Comment(1)
Ben Voigt's citation is as authoritative as mine. If they were in contradiction, it would be an issue.Jaimie
H
8

The standard says (section [dcl.array]):

An object of array type contains a contiguously allocated non-empty set of N subobjects of type T.

Therefore there is no padding between array elements.

Padding inside structures is not required by the standard, but the standard doesn't permit any other way of aligning array elements.

Humming answered 9/1, 2011 at 8:13 Comment(3)
Array elements don't need to be aligned to anything in particular. The compiler only aligns them to make access more efficient. So the compiler only needs to pad if it wants to align to a boundary greater than 1.Interferon
@Martin: All objects in C++ are required by the standard to be well-aligned. Of course, the compiler could arbitrarily decide that every type only needs to be 1-byte aligned, but most compilers use the CPU's underlying alignment requirements, and whichever alignment the compiler uses, it has to make sure that objects respect it, in arrays and everywhere else.Foetor
@Martin: The standard doesn't require things to be aligned, but hardware often does. And the standard says that the only way of doing alignment in an array is to increase the size of the element data type, leaving holes between elements is forbidden.Humming
F
4

C++ doesn't explicitly say so, but it is a consequence of two other requirements:

First, all objects must be well-aligned.

3.8/1 says

The lifetime of an object of type T begins when [...] storage with the proper alignment and size for type T is obtained

and 3.9/5:

Object types have alignment requirements (3.9.1, 3.9.2). The alignment of a complete object type is an implementation-defined integer value representing a number of bytes; an object is allocated at an address that meets the alignment requirements of its object type.

So every object must be aligned according to its alignment requirements.

The other requirement is that objects in an array are allocated contiguously:

8.3.4/1:

An object of array type contains a contiguously allocated non-empty set of N subobjects of type T.

For the objects in an array to be contiguously allocated, there can be no padding between them. But for every object in the array to be properly aligned, each individual object must be padded so that the byte immediately after the end of the object is also well aligned. In other words, the size of the object must be a multiple of its alignment.

Foetor answered 9/1, 2011 at 14:5 Comment(0)
E
3

I am unsure if this is in the actual C/C++ standard, and I am inclined to say that it is up to the compiler (just to be on the safe side). However, I had a "fun" time figuring that out a few months ago, where I had to send dynamically generated C structs as byte arrays across a network as part of a protocol, to communicate with a chip. The alignment and size of all the structs had to be consistent with the structs in the code running on the chip, which was compiled with a variant of GCC for the MIPS architecture. I'll attempt to give the algorithm, and it should apply to all variants of gcc (and hopefully most other compilers).

All base types, like char, short and int align to their size, and they align to the next available position, regardless of the alignment of the parent. And to answer the original question, yes the total size is a multiple of the alignment.

// size 8
struct {
    char A; //byte 0
    char B; //byte 1
    int C; //byte 4
};

Even though the alignment of the struct is 4 bytes, the chars are still packed as close as possible.

The alignment of a struct is equal to the largest alignment of its members.

Example:

//size 4, but alignment is 2!
struct foo {
    char A; //byte 0
    char B; //byte 1
    short C; //byte 2
};

//size 6
struct bar {
    char A;         //byte 0
    struct foo B;   //byte 2
};

This also applies to unions, and in a curious way. The size of a union can be larger than any of the sizes of its members, simply due to alignment:

//size 3, alignment 1
struct foo {
    char A; //byte 0
    char B; //byte 1
    char C; //byte 2
};

//size 2, alignment 2
struct bar {
    short A; //byte 0
};

//size 4! alignment 2
union foobar {
    struct foo A;
    struct bar B;
};

Using these simple rules, you should be able to figure out the alignment/size of any horribly nested union/struct you come across. This is all from memory, so if I have missed a corner case that can't be decided from these rules please let me know!

Electrodynamometer answered 9/1, 2011 at 5:59 Comment(0)
G
2

It is possible to produce a C or C++ typedef whose size is not a multiple of its alignment. This came up recently in this bindgen bug. Here's a minimal example, which I'll call test.c below:

#include <stdio.h>
#include <stdalign.h>

__attribute__ ((aligned(4))) typedef struct {
    char x[3];
} WeirdType;

int main() {
    printf("sizeof(WeirdType) = %zu\n", sizeof(WeirdType));
    printf("alignof(WeirdType) = %zu\n", alignof(WeirdType));
    return 0;
}

On my Arch Linux x86_64 machine, gcc -dumpversion && gcc test.c && ./a.out prints:

9.3.0
sizeof(WeirdType) = 3
alignof(WeirdType) = 4

Similarly clang -dumpversion && clang test.c && ./a.out prints:

9.0.1
sizeof(WeirdType) = 3
alignof(WeirdType) = 4

Saving the file as test.cc and using g++/clang++ gives the same result. (Update from a couple years later: I get the same results from GCC 11.1.0 and Clang 13.0.0.)

Notably however, MSVC on Windows does not seem to reproduce any behavior like this.

Ghee answered 23/3, 2020 at 17:52 Comment(2)
The original question was about structs rather than typedefs, so assuming no other counterexamples exist, the super pedantic answer might be "yes, sizeof(struct X) is always a multiple of alignof(struct X)." However, I think most people considering this question actually want to know whether it's true for any type, in which case the answer is evidently "no, sizeof(Y) is not necessarily a multiple of alignof(Y)."Stagecoach
Fun fact: this particular combination makes it impossible to make an array of this type: WeirdType data[2]; results in error: alignment of array elements is greater than element size.Panatella
G
1

So to split your question up into two:

1. Is it legal?

[5.3.3.2] When applied to a class, the result [of the sizeof() operator] is the number of bytes in an object of that class including any padding required for placing objects of that type in an array.

So, no, it's not.

2. Well, why isn't it?

Here, I can only speculate.

2.1. Pointer arithmetic gets weirder
If alignment were "between array elements" but did not affect the size, things would get needlessly complicated, e.g.

(char *)(X+1) != ((char *)X) + sizeof(*X)

(I have a hunch that this is required implicitly by the standard even without the above statement, but I can't prove it)

2.2 Simplicity
If alignment affects size, alignment and size can be decided by looking at a single type. Consider this:

struct A  {  int x; char y;  };
struct B  { A left, right;   };

With the current standard, I just need to know sizeof(A) to determine size and layout of B.
With the alternate you suggest I need to know the internals of A. Similar to your example2: for a "better packing", sizeof(example) is not enough, you need to consider the internals of example.

Gallbladder answered 9/1, 2011 at 5:42 Comment(11)
OK, but I'd have just said a padded_sizeof or array_stride_of would be handy. The advantage is in a struct like my example2 from the question - 6 bytes with alignment 4 gives 8 bytes per item, rather than 12 bytes with the two sets of 3 bytes padding. With a big enough array of those, that 33% saving can easily be significant for locality and memory bandwidth as well as the space savings.Dovecote
Also, on your "right" member - if that were an int field, you'd need to know "inner details" (the alignment as well as the size). Why not do the same with struct members that is done for any other type member? If for some reason you get alignment 4 on a 2-byte integer (e.g. some compiler-specific alignment setting feature), that too will need some extra padding between two adjacent fields of that type.Dovecote
see my update. I agree it would be nice for cases like your example2.Gallbladder
@Steve: but as soon as you place your 6-byte struct in an array, it has to be padded no matter what. Otherwise, every other member of the array would be unaligned. So the only difference would be whether the padding is "inside" the struct or between structs. C++ opted for the simpler, more consistent solution, that every struct must have a size so that it can be placed directly in an array.Foetor
@jalf - My 6 byte struct has to be padded to 8 bytes and, as you say, whether those 2 bytes are inside or outside the struct makes no difference. But the standard rules version is 8 bytes already for the padded inner struct. The extra member and padding makes that 12 for the outer struct. Nest structs rather than use a single flat struct and, occasionally, it makes a difference - though obviously the example is contrived. BTW - I'm explaining my thinking, not complaining. Trade-offs are a fact of life, and I was in the wrong for letting a rationalisation fool me into believing a mistake.Dovecote
@Steve: true. And if this extra memory usage really matters, it is fairly simple to work around. But in most cases, it's not worth it. You might as well complain that a bool uses a whopping 8 times more memory than it needs. But you could arrive at all sorts of weird and surprising behavior if you allow a struct to have a different size depending on when and how it is used. Your struct when placed in an array would have an effective size of 8 bytes (a pointer would have to point 8 bytes further ahead to point to the next element), but it'd only contribute 6 bytes as a member of a struct.Foetor
@jalf - Looks like you didn't see my delete+replace "edit" before replying. On the bool example, that's a very good comparison, and I agree. In C++, if you really care, you can lay out your data structures pretty much any way you want. You just have to manage that layout yourself, in much the same way an assembler programmer would have to. The odds of having a good enough reason to do that are small, of course.Dovecote
@jalf - BTW my struct would have different size depending on where used - there'd simply be more or less padding between members/array items/whatever - just as there is now with other types.Dovecote
Correcting the slip in the last comment - I know arrays don't have padding now, but if they did have, that'd really just make them more consistent with structs and unions. It would break the old usage of sizeof as the array stride for pointer arithmetic, which is a big price to pay for old code maintenance, of course - that's the trade-off thing I mentioned before.Dovecote
@Steve: Arrays shouldn't be consistent with structs and unions. It's a pretty important characteristic of arrays that we can rely on their memory layout, unlike with structs which are pretty much "the way the compiler likes". All to save 2 bytes here and there? Even my 4 year old laptop has two billion bytes to spend!Foetor
@jalf - Do you believe I claimed otherwise? Please read where I said "Trade-offs are a fact of life, and I was in the wrong for letting a rationalisation fool me into believing a mistake", or where I said "which is a big price to pay for old code maintenance, of course". No-one needs to convince me my rationalisation was wrong - I already know. Describing it doesn't mean I'm trying to claim it's a better way - describing it only means I'm describing it.Dovecote
S
0

The standard says very little about padding and alignment. Very little is guaranteed. About the only thing you can bet on is that the first element is at the beginning of the structure. After that...alignment and padding can be anything.

Strephon answered 9/1, 2011 at 5:15 Comment(6)
So my sizeof based code works by fluke? And (more worrying) the padding between two adjacent same-type structs in a larger struct may not be the same as the padding between two instances of the same type in an array, if some compiler decides to be strange (meaning that perhaps I should replace my array-stride calculations)?Dovecote
The distance between the beginning of type T in an array is guaranteed to be exactly sizeof(T) from the start of the previous. The same guarantee is not given for within any class structure.Strephon
Not just the first element; all elements are guaranteed in the same order. So don't interleave chars with longs to avoid unnecessary padding.Deathwatch
@Philip: AFAIK elements with the same accessibility are guaranteed to be in order but there is no guarantee for the relationship of elements with different accessibilities, the compiler could choose to regroup all elements with the same accessibility or to interleave them or whatever.Womenfolk
@Matthieu: argh, was thinking in C not C++. Don't know C++ well enough to say for sure.Deathwatch
@Philip: I guess the rules hold for C too, except that for C all elements have the same visibility (public). Honestly I never liked that rule much, I suppose it was meant for backward compatibility, but I'd much prefer a compiler which can "compact" the class instead of having to worry about the layout when I am writing it :/ Of course, it means greater control (with regard to cache line, or ABI compatibility)Womenfolk
O
0

Seems the C++03 standard didn't say (or I didn't find) whether the alignment padding bytes should be included in the object representation.

And the C99 standard says the "sizeof" a struct type or union type includes internal and trailing padding, but I'm not sure if all alignment padding is included in that "trailing padding".

Now back to your example. There is really no confusion. sizeof(example) == 8 means the structure does take 8 bytes to represent itself, including the trailing 3 padding bytes. If the char in the second structure has an offset of 6, it will overwrite the space used by m_Example. The layout of a certain type is implementation-defined, and should be kept stable in the whole implementation.

Still, whether p+1 equals (T*)((char*)p + sizeof(T)) is unsure. And I'm hoping to find the answer.

Osage answered 9/1, 2011 at 7:27 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.