What are aggregates and trivial types/PODs, and how/why are they special?
Asked Answered
L

6

695

This FAQ is about Aggregates and PODs and covers the following material:

  • What are aggregates?
  • What are PODs (Plain Old Data)?
  • More recently, what are trivial or trivially copyable types?
  • How are they related?
  • How and why are they special?
  • What changes for C++11?
Luciana answered 14/11, 2010 at 15:35 Comment(2)
What is POD subset: https://mcmap.net/q/17069/-what-are-pod-types-in-c-duplicateCavetto
Can it be said that the motivation behind these definitions is roughly: POD == memcpy'able, Aggregate == aggregate-initializable?Commorancy
L
730

How to read:

This article is rather long. If you want to know about both aggregates and PODs (Plain Old Data) take time and read it. If you are interested just in aggregates, read only the first part. If you are interested only in PODs then you must first read the definition, implications, and examples of aggregates and then you may jump to PODs but I would still recommend reading the first part in its entirety. The notion of aggregates is essential for defining PODs. If you find any errors (even minor, including grammar, stylistics, formatting, syntax, etc.) please leave a comment, I'll edit.

This answer applies to C++03. For other C++ standards see:

What are aggregates and why they are special

Formal definition from the C++ standard (C++03 8.5.1 §1):

An aggregate is an array or a class (clause 9) with no user-declared constructors (12.1), no private or protected non-static data members (clause 11), no base classes (clause 10), and no virtual functions (10.3).

So, OK, let's parse this definition. First of all, any array is an aggregate. A class can also be an aggregate if… wait! nothing is said about structs or unions, can't they be aggregates? Yes, they can. In C++, the term class refers to all classes, structs, and unions. So, a class (or struct, or union) is an aggregate if and only if it satisfies the criteria from the above definitions. What do these criteria imply?

  • This does not mean an aggregate class cannot have constructors, in fact it can have a default constructor and/or a copy constructor as long as they are implicitly declared by the compiler, and not explicitly by the user

  • No private or protected non-static data members. You can have as many private and protected member functions (but not constructors) as well as as many private or protected static data members and member functions as you like and not violate the rules for aggregate classes

  • An aggregate class can have a user-declared/user-defined copy-assignment operator and/or destructor

  • An array is an aggregate even if it is an array of non-aggregate class type.

Now let's look at some examples:

class NotAggregate1
{
  virtual void f() {} //remember? no virtual functions
};

class NotAggregate2
{
  int x; //x is private by default and non-static 
};

class NotAggregate3
{
public:
  NotAggregate3(int) {} //oops, user-defined constructor
};

class Aggregate1
{
public:
  NotAggregate1 member1;   //ok, public member
  Aggregate1& operator=(Aggregate1 const & rhs) {/* */} //ok, copy-assignment  
private:
  void f() {} // ok, just a private function
};

You get the idea. Now let's see how aggregates are special. They, unlike non-aggregate classes, can be initialized with curly braces {}. This initialization syntax is commonly known for arrays, and we just learnt that these are aggregates. So, let's start with them.

Type array_name[n] = {a1, a2, …, am};

if(m == n)
the ith element of the array is initialized with ai
else if(m < n)
the first m elements of the array are initialized with a1, a2, …, am and the other n - m elements are, if possible, value-initialized (see below for the explanation of the term)
else if(m > n)
the compiler will issue an error
else (this is the case when n isn't specified at all like int a[] = {1, 2, 3};)
the size of the array (n) is assumed to be equal to m, so int a[] = {1, 2, 3}; is equivalent to int a[3] = {1, 2, 3};

When an object of scalar type (bool, int, char, double, pointers, etc.) is value-initialized it means it is initialized with 0 for that type (false for bool, 0.0 for double, etc.). When an object of class type with a user-declared default constructor is value-initialized its default constructor is called. If the default constructor is implicitly defined then all nonstatic members are recursively value-initialized. This definition is imprecise and a bit incorrect but it should give you the basic idea. A reference cannot be value-initialized. Value-initialization for a non-aggregate class can fail if, for example, the class has no appropriate default constructor.

Examples of array initialization:

class A
{
public:
  A(int) {} //no default constructor
};
class B
{
public:
  B() {} //default constructor available
};
int main()
{
  A a1[3] = {A(2), A(1), A(14)}; //OK n == m
  A a2[3] = {A(2)}; //ERROR A has no default constructor. Unable to value-initialize a2[1] and a2[2]
  B b1[3] = {B()}; //OK b1[1] and b1[2] are value initialized, in this case with the default-ctor
  int Array1[1000] = {0}; //All elements are initialized with 0;
  int Array2[1000] = {1}; //Attention: only the first element is 1, the rest are 0;
  bool Array3[1000] = {}; //the braces can be empty too. All elements initialized with false
  int Array4[1000]; //no initializer. This is different from an empty {} initializer in that
  //the elements in this case are not value-initialized, but have indeterminate values 
  //(unless, of course, Array4 is a global array)
  int array[2] = {1, 2, 3, 4}; //ERROR, too many initializers
}

Now let's see how aggregate classes can be initialized with braces. Pretty much the same way. Instead of the array elements we will initialize the non-static data members in the order of their appearance in the class definition (they are all public by definition). If there are fewer initializers than members, the rest are value-initialized. If it is impossible to value-initialize one of the members which were not explicitly initialized, we get a compile-time error. If there are more initializers than necessary, we get a compile-time error as well.

struct X
{
  int i1;
  int i2;
};
struct Y
{
  char c;
  X x;
  int i[2];
  float f; 
protected:
  static double d;
private:
  void g(){}      
}; 

Y y = {'a', {10, 20}, {20, 30}};

In the above example y.c is initialized with 'a', y.x.i1 with 10, y.x.i2 with 20, y.i[0] with 20, y.i[1] with 30 and y.f is value-initialized, that is, initialized with 0.0. The protected static member d is not initialized at all, because it is static.

Aggregate unions are different in that you may initialize only their first member with braces. I think that if you are advanced enough in C++ to even consider using unions (their use may be very dangerous and must be thought of carefully), you could look up the rules for unions in the standard yourself :).

Now that we know what's special about aggregates, let's try to understand the restrictions on classes; that is, why they are there. We should understand that memberwise initialization with braces implies that the class is nothing more than the sum of its members. If a user-defined constructor is present, it means that the user needs to do some extra work to initialize the members therefore brace initialization would be incorrect. If virtual functions are present, it means that the objects of this class have (on most implementations) a pointer to the so-called vtable of the class, which is set in the constructor, so brace-initialization would be insufficient. You could figure out the rest of the restrictions in a similar manner as an exercise :).

So enough about the aggregates. Now we can define a stricter set of types, to wit, PODs

What are PODs and why they are special

Formal definition from the C++ standard (C++03 9 §4):

A POD-struct is an aggregate class that has no non-static data members of type non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor. Similarly, a POD-union is an aggregate union that has no non-static data members of type non-POD-struct, non-POD-union (or array of such types) or reference, and has no user-defined copy assignment operator and no user-defined destructor. A POD class is a class that is either a POD-struct or a POD-union.

Wow, this one's tougher to parse, isn't it? :) Let's leave unions out (on the same grounds as above) and rephrase in a bit clearer way:

An aggregate class is called a POD if it has no user-defined copy-assignment operator and destructor and none of its nonstatic members is a non-POD class, array of non-POD, or a reference.

What does this definition imply? (Did I mention POD stands for Plain Old Data?)

  • All POD classes are aggregates, or, to put it the other way around, if a class is not an aggregate then it is sure not a POD
  • Classes, just like structs, can be PODs even though the standard term is POD-struct for both cases
  • Just like in the case of aggregates, it doesn't matter what static members the class has

Examples:

struct POD
{
  int x;
  char y;
  void f() {} //no harm if there's a function
  static std::vector<char> v; //static members do not matter
};

struct AggregateButNotPOD1
{
  int x;
  ~AggregateButNotPOD1() {} //user-defined destructor
};

struct AggregateButNotPOD2
{
  AggregateButNotPOD1 arrOfNonPod[3]; //array of non-POD class
};

POD-classes, POD-unions, scalar types, and arrays of such types are collectively called POD-types.
PODs are special in many ways. I'll provide just some examples.

  • POD-classes are the closest to C structs. Unlike them, PODs can have member functions and arbitrary static members, but neither of these two change the memory layout of the object. So if you want to write a more or less portable dynamic library that can be used from C and even .NET, you should try to make all your exported functions take and return only parameters of POD-types.

  • The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished. For POD classes, the lifetime begins when storage for the object is occupied and finishes when that storage is released or reused.

  • For objects of POD types it is guaranteed by the standard that when you memcpy the contents of your object into an array of char or unsigned char, and then memcpy the contents back into your object, the object will hold its original value. Do note that there is no such guarantee for objects of non-POD types. Also, you can safely copy POD objects with memcpy. The following example assumes T is a POD-type:

     #define N sizeof(T)
     char buf[N];
     T obj; // obj initialized to its original value
     memcpy(buf, &obj, N); // between these two calls to memcpy,
     // obj might be modified
     memcpy(&obj, buf, N); // at this point, each subobject of obj of scalar type
     // holds its original value
    
  • goto statement. As you may know, it is illegal (the compiler should issue an error) to make a jump via goto from a point where some variable was not yet in scope to a point where it is already in scope. This restriction applies only if the variable is of non-POD type. In the following example f() is ill-formed whereas g() is well-formed. Note that Microsoft's compiler is too liberal with this rule—it just issues a warning in both cases.

     int f()
     {
       struct NonPOD {NonPOD() {}};
       goto label;
       NonPOD x;
     label:
       return 0;
     }
    
     int g()
     {
       struct POD {int i; char c;};
       goto label;
       POD x;
     label:
       return 0;
     }
    
  • It is guaranteed that there will be no padding in the beginning of a POD object. In other words, if a POD-class A's first member is of type T, you can safely reinterpret_cast from A* to T* and get the pointer to the first member and vice versa.

The list goes on and on…

Conclusion

It is important to understand what exactly a POD is because many language features, as you see, behave differently for them.

Luciana answered 14/11, 2010 at 15:36 Comment(25)
Nice answer. Comments: "If the default constructor is implicitly defined then all nonstatic members are recursively value-initialized." and "Value-initialization for a non-aggregate class can fail if, for example, the class has no appropriate default constructor." is not correct: Value initialization of a class with an implicitly-declared default constructor does not necessiate an implicitly-defined default constructor. Thus, given (insert private: as appropriate): struct A { int const a; }; then A() is well-formed, even if A's default constructor definition would be ill-formed.Suter
This isn't really FAQ, it's far too long. Also, please get your acronyms defined in the first paragraph of the document if possible instead of 2/3 of the way through.Killoran
Armen, the key to format code in bullets is to indent it twice - once for being a bullet paragraph and once for being code. Fixed that for ya. :)Botzow
@Kev: If you manage to pack the same information into a shorter answer, we'd all happily up-vote it!Botzow
"An array is an aggregate even it is an array of non-aggregate class type.", same is true about classes with non-aggregate (non-static) data members, ala it suffices the usual conditions for aggregates. I'm not sure, but I think making a separate answer about aggregate-initialization would work too. That alone is sufficient material for a single answer, it seems (stuff like brace elision and why it can be dangerous can also then be mentioned).Suter
@Johannes: OK, but I find it difficult to rephrase. If you would take time to correct the sentence, I'd be grateful. I mean the declared/defined partLuciana
@Botzow - This maybe could have been broken up into three or four separate articles with links to each other where relevant? At the moment it's a bit of a wall of text.Killoran
@Kev: We're all still heavily experimenting with this newly born idea, and at least I am far from being fixated on a specific format. Your idea seems good to me, so assuming Armen wouldn't mind (Armen?) why don't you go ahead and just do it?Botzow
@sbi, @Kev: If you want to split the article into two then I do mind, but apparently that does not matter. On the other hand, I agree that shorter is better, and if I write other answers like this I'll take that into considerationLuciana
@Armen also do note you can do multiple answers to the same question. Each answer could contain part of the solution to the question. Screw that accepted-mark thing, in my opinion :)Suter
You asked to point out an errors, even typos. :) "I think that if you are advances". Great article, thank you very much!Dealing
i'm not a big shot c/c++ developer right now, just beginning to learn the finer things to advance as a serious programmer (for a MNC in which i got placed). Should i be worried about this right now? all this POD and aggregates stuff? (I could understand aggregates but PODs left me scratching my head to be honest).Nicoline
Great article! But with regards to "POD's are special in an extremely many ways" is this fully detailed anywhere? I have searched all over and this is the most comprehensive list I've found but I'm curious about what else there might be.Siegbahn
Answer is great. I'm still revisit this post for some-times. By the way about warnings for Visual Studio. "goto statement" for pod come with ignorance to MSVC compiler as You have mentioned. But for switch/case statement it generates compilation error. Based on it concept I have made some test-pod-checker: https://mcmap.net/q/17548/-test-for-quot-pod-ness-quot-in-c-c-11/…Megargee
Nice post! I've a doubt about it -- if this is not the best place to ask it, please let me know. On the POD definition it is stated that "Similarly, a POD-union is an aggregate union that has no .... types) or reference, and has ...". The reference on this context means only references or it also means pointers? I.e., it is literal or it the same of class: "In C++, the term class refers to all classes, structs, and unions."?Deflected
I'd change the last example in the "aggregates" section, so that there would be some non-trivial member in the aggregate, say std::stringSmudge
Question: What about pointers? Is your struct POD still a POD if I add int * pi; or char * acName; or even MyCustomClass * pObject;?Furbelow
@Fabian: Yes, it is still a PODLuciana
Very good explanation indeed! I would recommend to clarify that aggregates can have public, protected and private functions, static or non-static, as well as public, protected and private static data. This became clear to me in the non-POD example with a public function.Furbelow
In the array example - B b1[3] = {B()};, how come B is an aggregate class, given that it has a user-defined constructor?Ywis
@SexyBeast: B isn't an aggregate class, but an array of B is an aggregate. An array of anything is an aggregateLuciana
In the bullet point that begins with "The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished." the last part should instead say "when the destructor starts."Counterweigh
> The lifetime of objects of non-POD class type [...] ends when the destructor has finished I'm pretty sure the lifetime ends when the destructor starts...Zeba
@ArmenTsirunyan "When an object of class type with a user-declared default constructor is value-initialized its default constructor is called. If the default constructor is implicitly defined then all nonstatic members are recursively value-initialized. " Here what is meant by nonstatic members are recursively value-initialized. can you elaborate please ?Isabea
@ArmenTsirunyan 2 "The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished. For POD classes, the lifetime begins when storage for the object is occupied and finishes when that storage is released or reused." So what's the role of compiler generated constructor and destructor ?Isabea
C
523

What changes for C++11?

Aggregates

The standard definition of an aggregate has changed slightly, but it's still pretty much the same:

An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1), no brace-or-equal-initializers for non-static data members (9.2), no private or protected non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3).

Ok, what changed?

  1. Previously, an aggregate could have no user-declared constructors, but now it can't have user-provided constructors. Is there a difference? Yes, there is, because now you can declare constructors and default them:

    struct Aggregate {
        Aggregate() = default; // asks the compiler to generate the default implementation
    };
    

    This is still an aggregate because a constructor (or any special member function) that is defaulted on the first declaration is not user-provided.

  2. Now an aggregate cannot have any brace-or-equal-initializers for non-static data members. What does this mean? Well, this is just because with this new standard, we can initialize members directly in the class like this:

    struct NotAggregate {
        int x = 5; // valid in C++11
        std::vector<int> s{1,2,3}; // also valid
    };
    

    Using this feature makes the class no longer an aggregate because it's basically equivalent to providing your own default constructor.

So, what is an aggregate didn't change much at all. It's still the same basic idea, adapted to the new features.

What about PODs?

PODs went through a lot of changes. Lots of previous rules about PODs were relaxed in this new standard, and the way the definition is provided in the standard was radically changed.

The idea of a POD is to capture basically two distinct properties:

  1. It supports static initialization, and
  2. Compiling a POD in C++ gives you the same memory layout as a struct compiled in C.

Because of this, the definition has been split into two distinct concepts: trivial classes and standard-layout classes, because these are more useful than POD. The standard now rarely uses the term POD, preferring the more specific trivial and standard-layout concepts.

The new definition basically says that a POD is a class that is both trivial and has standard-layout, and this property must hold recursively for all non-static data members:

A POD struct is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). Similarly, a POD union is a union that is both a trivial class and a standard layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). A POD class is a class that is either a POD struct or a POD union.

Let's go over each of these two properties in detail separately.

Trivial classes

Trivial is the first property mentioned above: trivial classes support static initialization. If a class is trivially copyable (a superset of trivial classes), it is ok to copy its representation over the place with things like memcpy and expect the result to be the same.

The standard defines a trivial class as follows:

A trivially copyable class is a class that:

— has no non-trivial copy constructors (12.8),

— has no non-trivial move constructors (12.8),

— has no non-trivial copy assignment operators (13.5.3, 12.8),

— has no non-trivial move assignment operators (13.5.3, 12.8), and

— has a trivial destructor (12.4).

A trivial class is a class that has a trivial default constructor (12.1) and is trivially copyable.

[ Note: In particular, a trivially copyable or trivial class does not have virtual functions or virtual base classes.—end note ]

So, what are all those trivial and non-trivial things?

A copy/move constructor for class X is trivial if it is not user-provided and if

— class X has no virtual functions (10.3) and no virtual base classes (10.1), and

— the constructor selected to copy/move each direct base class subobject is trivial, and

— for each non-static data member of X that is of class type (or array thereof), the constructor selected to copy/move that member is trivial;

otherwise the copy/move constructor is non-trivial.

Basically this means that a copy or move constructor is trivial if it is not user-provided, the class has nothing virtual in it, and this property holds recursively for all the members of the class and for the base class.

The definition of a trivial copy/move assignment operator is very similar, simply replacing the word "constructor" with "assignment operator".

A trivial destructor also has a similar definition, with the added constraint that it can't be virtual.

And yet another similar rule exists for trivial default constructors, with the addition that a default constructor is not-trivial if the class has non-static data members with brace-or-equal-initializers, which we've seen above.

Here are some examples to clear everything up:

// empty classes are trivial
struct Trivial1 {};

// all special members are implicit
struct Trivial2 {
    int x;
};

struct Trivial3 : Trivial2 { // base class is trivial
    Trivial3() = default; // not a user-provided ctor
    int y;
};

struct Trivial4 {
public:
    int a;
private: // no restrictions on access modifiers
    int b;
};

struct Trivial5 {
    Trivial1 a;
    Trivial2 b;
    Trivial3 c;
    Trivial4 d;
};

struct Trivial6 {
    Trivial2 a[23];
};

struct Trivial7 {
    Trivial6 c;
    void f(); // it's okay to have non-virtual functions
};

struct Trivial8 {
     int x;
     static NonTrivial1 y; // no restrictions on static members
};

struct Trivial9 {
     Trivial9() = default; // not user-provided
      // a regular constructor is okay because we still have default ctor
     Trivial9(int x) : x(x) {};
     int x;
};

struct NonTrivial1 : Trivial3 {
    virtual void f(); // virtual members make non-trivial ctors
};

struct NonTrivial2 {
    NonTrivial2() : z(42) {} // user-provided ctor
    int z;
};

struct NonTrivial3 {
    NonTrivial3(); // user-provided ctor
    int w;
};
NonTrivial3::NonTrivial3() = default; // defaulted but not on first declaration
                                      // still counts as user-provided
struct NonTrivial5 {
    virtual ~NonTrivial5(); // virtual destructors are not trivial
};

Standard-layout

Standard-layout is the second property. The standard mentions that these are useful for communicating with other languages, and that's because a standard-layout class has the same memory layout of the equivalent C struct or union.

This is another property that must hold recursively for members and all base classes. And as usual, no virtual functions or virtual base classes are allowed. That would make the layout incompatible with C.

A relaxed rule here is that standard-layout classes must have all non-static data members with the same access control. Previously these had to be all public, but now you can make them private or protected, as long as they are all private or all protected.

When using inheritance, only one class in the whole inheritance tree can have non-static data members, and the first non-static data member cannot be of a base class type (this could break aliasing rules), otherwise, it's not a standard-layout class.

This is how the definition goes in the standard text:

A standard-layout class is a class that:

— has no non-static data members of type non-standard-layout class (or array of such types) or reference,

— has no virtual functions (10.3) and no virtual base classes (10.1),

— has the same access control (Clause 11) for all non-static data members,

— has no non-standard-layout base classes,

— either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and

— has no base classes of the same type as the first non-static data member.

A standard-layout struct is a standard-layout class defined with the class-key struct or the class-key class.

A standard-layout union is a standard-layout class defined with the class-key union.

[ Note: Standard-layout classes are useful for communicating with code written in other programming languages. Their layout is specified in 9.2.—end note ]

And let's see a few examples.

// empty classes have standard-layout
struct StandardLayout1 {};

struct StandardLayout2 {
    int x;
};

struct StandardLayout3 {
private: // both are private, so it's ok
    int x;
    int y;
};

struct StandardLayout4 : StandardLayout1 {
    int x;
    int y;

    void f(); // perfectly fine to have non-virtual functions
};

struct StandardLayout5 : StandardLayout1 {
    int x;
    StandardLayout1 y; // can have members of base type if they're not the first
};

struct StandardLayout6 : StandardLayout1, StandardLayout5 {
    // can use multiple inheritance as long only
    // one class in the hierarchy has non-static data members
};

struct StandardLayout7 {
    int x;
    int y;
    StandardLayout7(int x, int y) : x(x), y(y) {} // user-provided ctors are ok
};

struct StandardLayout8 {
public:
    StandardLayout8(int x) : x(x) {} // user-provided ctors are ok
// ok to have non-static data members and other members with different access
private:
    int x;
};

struct StandardLayout9 {
    int x;
    static NonStandardLayout1 y; // no restrictions on static members
};

struct NonStandardLayout1 {
    virtual f(); // cannot have virtual functions
};

struct NonStandardLayout2 {
    NonStandardLayout1 X; // has non-standard-layout member
};

struct NonStandardLayout3 : StandardLayout1 {
    StandardLayout1 x; // first member cannot be of the same type as base
};

struct NonStandardLayout4 : StandardLayout3 {
    int z; // more than one class has non-static data members
};

struct NonStandardLayout5 : NonStandardLayout3 {}; // has a non-standard-layout base class

Conclusion

With these new rules a lot more types can be PODs now. And even if a type is not POD, we can take advantage of some of the POD properties separately (if it is only one of trivial or standard-layout).

The standard library has traits to test these properties in the header <type_traits>:

template <typename T>
struct std::is_pod;
template <typename T>
struct std::is_trivial;
template <typename T>
struct std::is_trivially_copyable;
template <typename T>
struct std::is_standard_layout;
Consult answered 25/8, 2011 at 11:48 Comment(12)
can you please elaborate following rules: a) standard-layout classes must have all non-static data members with the same access control; b) only one class in the whole inheritance tree can have non-static data members, and the first non-static data member cannot be of a base class type (this could break aliasing rules). Especially what are the reasons for them? For the later rule, can you provide an example of breaking aliasing?Ivaivah
@AndyT: See my answer. I tried to answer to the best of my knowledge.Kisor
Might want to update this for C++14, which removed the "no brace-or-equal-initializers" requirement for aggregates.Trescott
@Trescott thanks for heads-up. I'll look up those changes soon and update it.Consult
For user-declared constructors, it's also useful when user declare a constructor as private without providing definition.Connor
Some added examples of POD which share properties of triviality and standard layout would help :)Gonsalez
Please update this awesome explanation for corresponding to the deprecation of POD in c++20 draft. 48225673Cephalonia
Regarding the aliasing: There's a C++ layout rule that if a class C has an (empty) base X, and the first data member of C is type X, then that first member can't be at the same offset as the base X; it gets a dummy padding byte ahead of it, if needed to avoid that. Having two instances of X (or subclass) at the same address could break things that need to distinguish different instances via their addresses (an empty instance has nothing else to distinguish it...). In any case the need to put in that padding byte breaks 'layout compatible'.Spartacus
...a POD is a class that is both trivial and has standard-layout... It appears to me that if a class is trivial, all non-static data members must be trivial, and if a class has standard-layout, that all non-static data members must have standard-layout. Is there any reason to specify ...and this property must hold recursively for all non-static data members? More specifically, is it possible for a class to be trivial and have standard-layout, but not be a POD, or for a class to be non-trivial or have non-standard-layout, but still be a POD?Sympathize
What happened to the POD requirement that they couldn't contain any pointer-to-member nonstatic data members?Undecagon
All POD classes are aggregates, or, to put it the other way around, if a class is not an aggregate then it is sure not a POD - I believe this doesn't hold true anymore for C++11 right?Hiss
s/representation over the place/representation all over the place/ ? (Correct use of idiom - merriam-webster.com/dictionary/over%20the%20place)Hourigan
S
130

What has changed for C++14

We can refer to the Draft C++14 standard for reference.

Aggregates

This is covered in section 8.5.1 Aggregates which gives us the following definition:

An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1), no private or protected non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3).

The only change is now adding in-class member initializers does not make a class a non-aggregate. So the following example from C++11 aggregate initialization for classes with member in-place initializers:

struct A
{
  int a = 3;
  int b = 3;
};

was not an aggregate in C++11 but it is in C++14. This change is covered in N3605: Member initializers and aggregates, which has the following abstract:

Bjarne Stroustrup and Richard Smith raised an issue about aggregate initialization and member-initializers not working together. This paper proposes to fix the issue by adopting Smith's proposed wording that removes a restriction that aggregates can't have member-initializers.

POD stays the same

The definition for POD(plain old data) struct is covered in section 9 Classes which says:

A POD struct110 is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). Similarly, a POD union is a union that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). A POD class is a class that is either a POD struct or a POD union.

which is the same wording as C++11.

Standard-Layout Changes for C++14

As noted in the comments pod relies on the definition of standard-layout and that did change for C++14 but this was via defect reports that were applied to C++14 after the fact.

There were three DRs:

So standard-layout went from this Pre C++14:

A standard-layout class is a class that:

  • (7.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
  • (7.2) has no virtual functions ([class.virtual]) and no virtual base classes ([class.mi]),
  • (7.3) has the same access control (Clause [class.access]) for all non-static data members,
  • (7.4) has no non-standard-layout base classes,
  • (7.5) either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
  • (7.6) has no base classes of the same type as the first non-static data member.109

To this in C++14:

A class S is a standard-layout class if it:

  • (3.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
  • (3.2) has no virtual functions and no virtual base classes,
  • (3.3) has the same access control for all non-static data members,
  • (3.4) has no non-standard-layout base classes,
  • (3.5) has at most one base class subobject of any given type,
  • (3.6) has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
  • (3.7) has no element of the set M(S) of types as a base class, where for any type X, M(X) is defined as follows.104 [ Note: M(X) is the set of the types of all non-base-class subobjects that may be at a zero offset in X. — end note  ]
    • (3.7.1) If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
    • (3.7.2) If X is a non-union class type with a non-static data member of type X0 that is either of zero size or is the first non-static data member of X (where said member may be an anonymous union), the set M(X) consists of X0 and the elements of M(X0).
    • (3.7.3) If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the ith non-static data member of X.
    • (3.7.4) If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
    • (3.7.5) If X is a non-class, non-array type, the set M(X) is empty.
Saintly answered 16/12, 2014 at 18:21 Comment(5)
There is a proposal to allow aggregates to have a base class as long as it is default constructable see N4404Saintly
while POD might stay the same, C++14 StandardLayoutType, which is a requirement for POD, has changed according to cppref: en.cppreference.com/w/cpp/named_req/StandardLayoutTypeCavetto
@CiroSantilli新疆改造中心六四事件法轮功 thank you, I don't know how I missed those, I will try to update over the next couple of days.Saintly
Let me know if you can come up with an example that is POD in C++14 but not in C++11 :-) I've started a detailed list of examples at: https://mcmap.net/q/17069/-what-are-pod-types-in-c-duplicate/…Cavetto
@CiroSantilli新疆改造中心六四事件法轮功 so what happened here is if we look at the standard-layout description in C++11 and C++14 they match. These changes where applied via defect reports back to C++14. So when I wrote this originally it was correct :-pSaintly
B
69

Changes in C++17

Download the C++17 International Standard final draft here.

Aggregates

C++17 expands and enhances aggregates and aggregate initialization. The standard library also now includes an std::is_aggregate type trait class. Here is the formal definition from section 11.6.1.1 and 11.6.1.2 (internal references elided):

An aggregate is an array or a class with
— no user-provided, explicit, or inherited constructors,
— no private or protected non-static data members,
— no virtual functions, and
— no virtual, private, or protected base classes.
[ Note: Aggregate initialization does not allow accessing protected and private base class’ members or constructors. —end note ]
The elements of an aggregate are:
— for an array, the array elements in increasing subscript order, or
— for a class, the direct base classes in declaration order, followed by the direct non-static data members that are not members of an anonymous union, in declaration order.

What changed?

  1. Aggregates can now have public, non-virtual base classes. Furthermore, it is not a requirement that base classes be aggregates. If they are not aggregates, they are list-initialized.
struct B1 // not a aggregate
{
    int i1;
    B1(int a) : i1(a) { }
};
struct B2
{
    int i2;
    B2() = default;
};
struct M // not an aggregate
{
    int m;
    M(int a) : m(a) { }
};
struct C : B1, B2
{
    int j;
    M m;
    C() = default;
};
C c { { 1 }, { 2 }, 3, { 4 } };
cout
    << "is C aggregate?: " << (std::is_aggregate<C>::value ? 'Y' : 'N')
    << " i1: " << c.i1 << " i2: " << c.i2
    << " j: " << c.j << " m.m: " << c.m.m << endl;

//stdout: is C aggregate?: Y, i1=1 i2=2 j=3 m.m=4
  1. Explicit defaulted constructors are disallowed
struct D // not an aggregate
{
    int i = 0;
    D() = default;
    explicit D(D const&) = default;
};
  1. Inheriting constructors are disallowed
struct B1
{
    int i1;
    B1() : i1(0) { }
};
struct C : B1 // not an aggregate
{
    using B1::B1;
};


Trivial Classes

The definition of trivial class was reworked in C++17 to address several defects that were not addressed in C++14. The changes were technical in nature. Here is the new definition at 12.0.6 (internal references elided):

A trivially copyable class is a class:
— where each copy constructor, move constructor, copy assignment operator, and move assignment operator is either deleted or trivial,
— that has at least one non-deleted copy constructor, move constructor, copy assignment operator, or move assignment operator, and
— that has a trivial, non-deleted destructor.
A trivial class is a class that is trivially copyable and has one or more default constructors, all of which are either trivial or deleted and at least one of which is not deleted. [ Note: In particular, a trivially copyable or trivial class does not have virtual functions or virtual base classes.—end note ]

Changes:

  1. Under C++14, for a class to be trivial, the class could not have any copy/move constructor/assignment operators that were non-trivial. However, then an implicitly declared as defaulted constructor/operator could be non-trivial and yet defined as deleted because, for example, the class contained a subobject of class type that could not be copied/moved. The presence of such non-trivial, defined-as-deleted constructor/operator would cause the whole class to be non-trivial. A similar problem existed with destructors. C++17 clarifies that the presence of such constructor/operators does not cause the class to be non-trivially copyable, hence non-trivial, and that a trivially copyable class must have a trivial, non-deleted destructor. DR1734, DR1928
  2. C++14 allowed a trivially copyable class, hence a trivial class, to have every copy/move constructor/assignment operator declared as deleted. If such as class was also standard layout, it could, however, be legally copied/moved with std::memcpy. This was a semantic contradiction, because, by defining as deleted all constructor/assignment operators, the creator of the class clearly intended that the class could not be copied/moved, yet the class still met the definition of a trivially copyable class. Hence in C++17 we have a new clause stating that trivially copyable class must have at least one trivial, non-deleted (though not necessarily publicly accessible) copy/move constructor/assignment operator. See N4148, DR1734
  3. The third technical change concerns a similar problem with default constructors. Under C++14, a class could have trivial default constructors that were implicitly defined as deleted, yet still be a trivial class. The new definition clarifies that a trivial class must have a least one trivial, non-deleted default constructor. See DR1496

Standard-layout Classes

The definition of standard-layout was also reworked to address defect reports. Again the changes were technical in nature. Here is the text from the standard (12.0.7). As before, internal references are elided:

A class S is a standard-layout class if it:
— has no non-static data members of type non-standard-layout class (or array of such types) or reference,
— has no virtual functions and no virtual base classes,
— has the same access control for all non-static data members,
— has no non-standard-layout base classes,
— has at most one base class subobject of any given type,
— has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
— has no element of the set M(S) of types (defined below) as a base class.108
M(X) is defined as follows:
— If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
— If X is a non-union class type whose first non-static data member has type X0 (where said member may be an anonymous union), the set M(X) consists of X0 and the elements of M(X0).
— If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the ith non-static data member of X.
— If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
— If X is a non-class, non-array type, the set M(X) is empty.
[ Note: M(X) is the set of the types of all non-base-class subobjects that are guaranteed in a standard-layout class to be at a zero offset in X. —end note ]
[ Example:

struct B { int i; }; // standard-layout class
struct C : B { }; // standard-layout class
struct D : C { }; // standard-layout class
struct E : D { char : 4; }; // not a standard-layout class
struct Q {};
struct S : Q { };
struct T : Q { };
struct U : S, T { }; // not a standard-layout class
—end example ]
108) This ensures that two subobjects that have the same class type and that belong to the same most derived object are not allocated at the same address.

Changes:

  1. Clarified that the requirement that only one class in the derivation tree "has" non-static data members refers to a class where such data members are first declared, not classes where they may be inherited, and extended this requirement to non-static bit fields. Also clarified that a standard-layout class "has at most one base class subobject of any given type." See DR1813, DR1881
  2. The definition of standard-layout has never allowed the type of any base class to be the same type as the first non-static data member. It is to avoid a situation where a data member at offset zero has the same type as any base class. The C++17 standard provides a more rigorous, recursive definition of "the set of the types of all non-base-class subobjects that are guaranteed in a standard-layout class to be at a zero offset" so as to prohibit such types from being the type of any base class. See DR1672, DR2120.

Note: The C++ standards committee intended the above changes based on defect reports to apply to C++14, though the new language is not in the published C++14 standard. It is in the C++17 standard.

Bismuthinite answered 10/6, 2018 at 17:6 Comment(6)
Note I just updated my answer the standard layout changes defects have CD4 status which means that they are actually applied to C++14. Which was why my answer did not include them b/c this happened after I wrote my answer.Saintly
Note, I started a bounty on this question.Saintly
Thanks @ShafikYaghmour. I will review the defect reports status and modify my answer accordingly.Bismuthinite
@ShafikYaghmour, After some review of the C++14 process and it appears to me that, while these DRs were "accepted" at the June 2014 Rapperswil meeting the language from the February 2014 Issaquah meeting was what became C++14. See isocpp.org/blog/2014/07/trip-report-summer-iso-c-meeting "in accordance with ISO rules we didn’t formally approve any edits to the C++ working paper." Am I missing something?Bismuthinite
They have 'CD4' status which means they should apply in C++14 mode.Saintly
@ShafikYaghmour. I added a note about the defect reports at the end of my answer. Thanks for the points!Bismuthinite
K
55

POD in C++11 was basically split into two different axes here: triviality and layout. Triviality is about the relationship between an object's conceptual value and the bits of data within its storage. Layout is about... well, the layout of an object's subobjects. Only class types have layout, while all types have triviality relationships.

So here is what the triviality axis is about:

  1. Non-trivially copyable: The value of objects of such types may be more than just the binary data that are stored directly within the object.

    For example, unique_ptr<T> stores a T*; that is the totality of the binary data within the object. But that's not the totality of the value of a unique_ptr<T>. A unique_ptr<T> stores either a nullptr or a pointer to an object whose lifetime is managed by the unique_ptr<T> instance. That management is part of the value of a unique_ptr<T>. And that value is not part of the binary data of the object; it is created by the various member functions of that object.

    For example, to assign nullptr to a unique_ptr<T> is to do more than just change the bits stored in the object. Such an assignment must destroy any object managed by the unique_ptr. To manipulate the internal storage of a unique_ptr without going through its member functions would damage this mechanism, to change its internal T* without destroying the object it currently manages, would violate the conceptual value that the object possesses.

  2. Trivially copyable: The value of such objects are exactly and only the contents of their binary storage. This is what makes it reasonable to allow copying that binary storage to be equivalent to copying the object itself.

    The specific rules that define trivial copyability (trivial destructor, trivial/deleted copy/move constructors/assignment) are what is required for a type to be binary-value-only. An object's destructor can participate in defining the "value" of an object, as in the case with unique_ptr. If that destructor is trivial, then it doesn't participate in defining the object's value.

    Specialized copy/move operations also can participate in an object's value. unique_ptr's move constructor modifies the source of the move operation by null-ing it out. This is what ensures that the value of a unique_ptr is unique. Trivial copy/move operations mean that such object value shenanigans are not being played, so the object's value can only be the binary data it stores.

  3. Trivial: This object is considered to have a functional value for any bits that it stores. Trivially copyable defines the meaning of the data store of an object as being just that data. But such types can still control how data gets there (to some extent). Such a type can have default member initializers and/or a default constructor that ensures that a particular member always has a particular value. And thus, the conceptual value of the object can be restricted to a subset of the binary data that it could store.

    Performing default initialization on a type that has a trivial default constructor will leave that object with completely uninitialized values. As such, a type with a trivial default constructor is logically valid with any binary data in its data storage.

The layout axis is really quite simple. Compilers are given a lot of leeway in deciding how the subobjects of a class are stored within the class's storage. However, there are some cases where this leeway is not necessary, and having more rigid ordering guarantees is useful.

Such types are standard layout types. And the C++ standard doesn't even really do much with saying what that layout is specifically. It basically says three things about standard layout types:

  1. The first subobject is at the same address as the object itself.

  2. You can use offsetof to get a byte offset from the outer object to one of its member subobjects.

  3. unions get to play some games with accessing subobjects through an inactive member of a union if the active member is (at least partially) using the same layout as the inactive one being accessed.

Compilers generally permit standard layout objects to map to struct types with the same members in C. But there is no statement of that in the C++ standard; that's just what compilers feel like doing.

POD is basically a useless term at this point. It is just the intersection of trivial copyability (the value is only its binary data) and standard layout (the order of its subobjects is more well-defined). One can infer from such things that the type is C-like and could map to similar C objects. But the standard has no statements to that effect.


can you please elaborate following rules:

I'll try:

a) standard-layout classes must have all non-static data members with the same access control

That's simple: all non-static data members must all be public, private, or protected. You can't have some public and some private.

The reasoning for them goes to the reasoning for having a distinction between "standard layout" and "not standard layout" at all. Namely, to give the compiler the freedom to choose how to put things into memory. It's not just about vtable pointers.

Back when they standardized C++ in 98, they had to basically predict how people would implement it. While they had quite a bit of implementation experience with various flavors of C++, they weren't certain about things. So they decided to be cautious: give the compilers as much freedom as possible.

That's why the definition of POD in C++98 is so strict. It gave C++ compilers great latitude on member layout for most classes. Basically, POD types were intended to be special cases, something you specifically wrote for a reason.

When C++11 was being worked on, they had a lot more experience with compilers. And they realized that... C++ compiler writers are really lazy. They had all this freedom, but they didn't do anything with it.

The rules of standard layout are more or less codifying common practice: most compilers didn't really have to change much if anything at all to implement them (outside of maybe some stuff for the corresponding type traits).

Now, when it came to public/private, things are different. The freedom to reorder which members are public vs. private actually can matter to the compiler, particularly in debugging builds. And since the point of standard layout is that there is compatibility with other languages, you can't have the layout be different in debug vs. release.

Then there's the fact that it doesn't really hurt the user. If you're making an encapsulated class, odds are good that all of your data members will be private anyway. You generally don't expose public data members on fully encapsulated types. So this would only be a problem for those few users who do want to do that, who want that division.

So it's no big loss.

b) only one class in the whole inheritance tree can have non-static data members,

The reason for this one comes back to why they standardized standard layout again: common practice.

There's no common practice when it comes to having two members of an inheritance tree that actually store things. Some put the base class before the derived, others do it the other way. Which way do you order the members if they come from two base classes? And so on. Compilers diverge greatly on these questions.

Also, thanks to the zero/one/infinity rule, once you say you can have two classes with members, you can say as many as you want. This requires adding a lot of layout rules for how to handle this. You have to say how multiple inheritance works, which classes put their data before other classes, etc. That's a lot of rules, for very little material gain.

You can't make everything that doesn't have virtual functions and a default constructor standard layout.

and the first non-static data member cannot be of a base class type (this could break aliasing rules).

I can't really speak to this one. I'm not educated enough in C++'s aliasing rules to really understand it. But it has something to do with the fact that the base member will share the same address as the base class itself. That is:

struct Base {};
struct Derived : Base { Base b; };

Derived d;
static_cast<Base*>(&d) == &d.b;

And that's probably against C++'s aliasing rules. In some way.

However, consider this: how useful could having the ability to do this ever actually be? Since only one class can have non-static data members, then Derived must be that class (since it has a Base as a member). So Base must be empty (of data). And if Base is empty, as well as a base class... why have a data member of it at all?

Since Base is empty, it has no state. So any non-static member functions will do what they do based on their parameters, not their this pointer.

So again: no big loss.

Kisor answered 28/2, 2012 at 23:26 Comment(4)
Thanks for explanation, it helps a lot. Probably despite static_cast<Base*>(&d) and &d.b are the same Base* type, they points to different things thus breaking aliasing rule. Please correct me.Ivaivah
and, why if only one class can have non-static data members, then Derived must be that class?Ivaivah
@AndyT: In order for Derived's first member to be its base class, it must have two things: a base class, and a member. And since only one class in the hierarchy can have members (and still be standard-layout), this means that its base class cannot have members.Kisor
@AndyT, Yeah, you're essentially correct, IME, about the aliasing rule. Two distinct instances of the same type are required to have distinct memory addresses. (This allows object identity tracking with memory addresses.) The base object and the first derived member are different instances, so they must have different addresses, which forces padding to be added, affecting the class's layout. If they were of different types, it wouldn't matter; objects with different types are allowed to have the same address (a class and its first data member, for example).Nitrite
S
40

What changes in

Following the rest of the clear theme of this question, the meaning and use of aggregates continues to change with every standard. There are several key changes on the horizon.

Types with user-declared constructors P1008

In C++17, this type is still an aggregate:

struct X {
    X() = delete;
};

And hence, X{} still compiles because that is aggregate initialization - not a constructor invocation. See also: When is a private constructor not a private constructor?

In C++20, the restriction will change from requiring:

no user-provided, explicit, or inherited constructors

to

no user-declared or inherited constructors

This has been adopted into the C++20 working draft. Neither the X here nor the C in the linked question will be aggregates in C++20.

This also makes for a yo-yo effect with the following example:

class A { protected: A() { }; };
struct B : A { B() = default; };
auto x = B{};

In C++11/14, B was not an aggregate due to the base class, so B{} performs value-initialization which calls B::B() which calls A::A(), at a point where it is accessible. This was well-formed.

In C++17, B became an aggregate because base classes were allowed, which made B{} aggregate-initialization. This requires copy-list-initializing an A from {}, but from outside the context of B, where it is not accessible. In C++17, this is ill-formed (auto x = B(); would be fine though).

In C++20 now, because of the above rule change, B once again ceases to be an aggregate (not because of the base class, but because of the user-declared default constructor - even though it's defaulted). So we're back to going through B's constructor, and this snippet becomes well-formed.

Initializing aggregates from a parenthesized list of values P960

A common issue that comes up is wanting to use emplace()-style constructors with aggregates:

struct X { int a, b; };
std::vector<X> xs;
xs.emplace_back(1, 2); // error

This does not work, because emplace will try to effectively perform the initialization X(1, 2), which is not valid. The typical solution is to add a constructor to X, but with this proposal (currently working its way through Core), aggregates will effectively have synthesized constructors which do the right thing - and behave like regular constructors. The above code will compile as-is in C++20.

Class Template Argument Deduction (CTAD) for Aggregates P1021 (specifically P1816)

In C++17, this does not compile:

template <typename T>
struct Point {
    T x, y;
};

Point p{1, 2}; // error

Users would have to write their own deduction guide for all aggregate templates:

template <typename T> Point(T, T) -> Point<T>;

But as this is in some sense "the obvious thing" to do, and is basically just boilerplate, the language will do this for you. This example will compile in C++20 (without the need for the user-provided deduction guide).

Shannan answered 17/12, 2018 at 16:56 Comment(6)
Although I will upvote, it does feel a little early to be adding this, I don't know of anything major coming down that would change this before C++2x is done though.Saintly
@ShafikYaghmour Yeah, probably WAY too early. But given that SD was the deadline for new language features, these are the only two in-flight that I'm aware of - worst case I just block-delete one of these sections later? I just saw the question active with the bounty and thought it was a good time to chime in before I forget.Shannan
I understand, I have been tempted a couple of times for similar cases. I always worry something major will change and I will end up having to rewrite it.Saintly
@ShafikYaghmour Seems like nothing's going to change here :)Shannan
@NooneAtAll Yep, CTAD for aggregates is in C++20. Updated.Shannan
You should add a few words on designated initializersLibre

© 2022 - 2024 — McMap. All rights reserved.