Why do compilers (e.g. gcc) deal with the memory layout of derived classes in this way?

Asked 24/5, 2014 at 5:58 Answered 24/5, 2014 at 7:11

Here is my cpp code.

#include <iostream>
using namespace std;

class A {
public:
    int val;
    char a;
};

class B: public A {
public:
    char b;
};

class C: public B {
public:
    char c;
};

int main()
{
    cout << sizeof(A) << endl;
    cout << sizeof(B) << endl;
    cout << sizeof(C) << endl;

    return 0;
}

The output of the program (in gcc) is:

8
12
12

This output confuses me a lot.

I know that alignment is probably why sizeof(A) equals 8 (sizeof(int) + sizeof(char) + 3 bytes of padding).

I also guess that sizeof(B) grows to 12 (sizeof(B) == sizeof(A) + sizeof(char) + 3 bytes of padding) to avoid overlap when a copy occurs. (Is that right?)

But what I really don't understand is why sizeof(B) is equal to sizeof(C).

Thanks a lot.

Refreshment answered 24/5, 2014 at 5:58 Comment(3)

The sizes, including the paddings, are in bytes, not in bits. – Pusan 24/5, 2014 at 6:32

Not related to runtime-debug prints, a VC++ compiler switch that ponies up the actual object-structural layout, vtables, virtual-bases, et-al, is incredibly educational. See this question for details on how it is done for that platform. I cannot say with experience whether something similar exists for g++, but I would be somewhat surprised if it did not. – Alumroot 24/5, 2014 at 7:1

It might be instructive to print the offsets of the variables with cout << "Offset of 'val': " << (int)(&((C*)0)->val) << " bytes.\n"; etc. – Lelia 24/5, 2014 at 7:43

Both GCC and Clang follow the Itanium C++ ABI document, which specifies:

... implementations may freely allocate objects in the tail padding of any class which would not have been POD in C++98

Class A is POD, so the compiler must not place anything in its tail padding. Class B is not POD under the C++98 definition, because a C++98 POD class must be an aggregate and an aggregate cannot have a base class. The compiler is therefore free to reuse the tail padding of B for members of derived classes, which is why C can store c there and sizeof(C) == sizeof(B). The basic idea was that the C++ class layout should mirror the equivalent C struct layout for POD types, while no such constraint applies to other classes. Because the meaning of "POD" has changed several times across C++ standards, the ABI explicitly uses the definition from C++98.
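
To see where each member actually lands, one option is to measure byte offsets on a live object instead of reasoning from sizeof alone. The following is a minimal sketch: offset_in and print_layout are hypothetical helper names introduced here for illustration, and the classes are copied from the question. It assumes a compiler that follows the Itanium C++ ABI (GCC or Clang on most non-Windows targets); other ABIs may place c differently.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

class A { public: int val; char a; };
class B : public A { public: char b; };
class C : public B { public: char c; };

// Hypothetical helper: byte distance from the start of `obj` to `member`.
// `member` is expected to be a data member of `obj` itself.
template <class Obj, class Member>
std::size_t offset_in(const Obj& obj, const Member& member) {
    const char* base = reinterpret_cast<const char*>(&obj);
    const char* field = reinterpret_cast<const char*>(&member);
    return static_cast<std::size_t>(field - base);
}

// Hypothetical helper: prints the offset of every member of a C object.
void print_layout() {
    C obj{};
    std::cout << "val: " << offset_in(obj, obj.val) << '\n'
              << "a:   " << offset_in(obj, obj.a) << '\n'
              << "b:   " << offset_in(obj, obj.b) << '\n'
              << "c:   " << offset_in(obj, obj.c) << '\n';

    // A was POD in C++98, so b has to start after all of A, padding included.
    assert(offset_in(obj, obj.b) >= sizeof(A));
}
```

Calling print_layout() from main on an Itanium-ABI compiler should be consistent with the sizes 8, 12, 12 reported in the question: b sits at offset 8, just past the padded A, while c sits at offset 9, inside what would otherwise be the tail padding of B.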

EDIT: About the rationale. POD types are very simple classes that could be written as a struct in C. For those types the layout should be identical to the layout a C compiler would create. In particular, the ABI authors want C-style tools like memcpy to keep working on A: copying sizeof(A) bytes also writes to the padding of A. If char b; were stored within that padding, a memcpy over the A part of a B object would overwrite it.
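
The memcpy scenario can be sketched as follows. The helper names overwrite_base_part and derived_member_survives are hypothetical and exist only for this illustration. Note the assumption: the C++ standard itself does not promise that a raw copy into a base subobject is safe, so this relies on the ABI rule quoted above, which keeps the padding of A free of other data.

```cpp
#include <cassert>
#include <cstring>

class A { public: int val; char a; };
class B : public A { public: char b; };

// Hypothetical helper: overwrite only the A part of a B object, C-style.
void overwrite_base_part(B& target, const A& source) {
    A* base = &target;                      // implicit derived-to-base conversion
    std::memcpy(base, &source, sizeof(A));  // copies A's members and its tail padding
}

// Returns true if `b` keeps its value after the raw copy into the A part.
bool derived_member_survives() {
    B obj{};
    obj.val = 1;
    obj.a = 'a';
    obj.b = 'x';

    A replacement{42, 'z'};
    overwrite_base_part(obj, replacement);

    // The A members were replaced by the copy.
    assert(obj.val == 42 && obj.a == 'z');
    return obj.b == 'x';
}
```

Because b is placed after the full sizeof(A) bytes, the copy cannot reach it. Had b been stored in the padding of A, the same call would have replaced it with whatever bytes the padding of replacement happened to hold.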

Hebetate answered 24/5, 2014 at 7:11 Comment(14)

Thanks for your answer. But can you explain what is this rule made for ? – Refreshment 24/5, 2014 at 9:37

I added some text to better explain the rationale. Hope this helps. – Hebetate 24/5, 2014 at 9:52

What is the difference between B and C which makes C non-POD with respect to C++98? Both are publicly inheriting from a POD-class and containing only POD members. Or wait, is B already non-POD? In that case I just misunderstood the quote from the standard (in that case, some more context and clarification of your edit would be helpful). – Municipalize 24/5, 2014 at 9:54

@Jonas A is POD, as a result the padding within A must not be used. B is non-POD, so the padding within B is fair game for subsequent objects. But see the problem. Better now? – Hebetate 24/5, 2014 at 10:9

So B is not POD, yet the compiler creates a POD-like layout? (being a bit confused here) – Municipalize 24/5, 2014 at 10:10

@Jonas No, it has nothing to do with B, it's the A they want to protect. They want to allow ugly C-tools like memcpy for A. – Hebetate 24/5, 2014 at 10:17

Ah now I get it. I didn’t realize that for protecting the layout of A, you might want to keep the padding clear of data. Thanks for your effort to explain that to me! – Municipalize 24/5, 2014 at 10:18

@jonas: The rule is more general. The derived class can never change the layout of the base class. The consequences would be horrible. So c cannot change the offset of b either. Only the tail padding is usable. – Cyrilla 24/5, 2014 at 10:37

Yeah, that makes sense, @david.pfx. I just had a hard time to realize why it makes sense to even protect the tail padding of A here (so that in C, you can memcpy over the A-part of a B struct). – Municipalize 24/5, 2014 at 10:40

@Cyrilla Even more general: derived classes can change the layout of base classes only if there are virtual base classes within the mix. – Hebetate 24/5, 2014 at 10:43

@pentadecagon: I don't see how. The offset of each member within its class must be preserved regardless of subsequent derivations. Can you provide an example that shows otherwise? – Cyrilla 24/5, 2014 at 11:18

@Hebetate thank you very much for your great great answer. I learn a lot. :) – Refreshment 24/5, 2014 at 12:15

@pentadecagon: By layout I mean the offset and size of each member within its enclosing struct/class. That doesn't change, and is usually quite predictable, even if not a standard-layout-class. Your example shows something different: offset of an inherited member. For non-SLC base class that's UB and for virtual not easily predictable either. – Cyrilla 25/5, 2014 at 1:6

Re POD: actually the standard uses the term standard-layout-class. An SLC may have non-trivial constructors and certain operators, but otherwise behaves as the POD described here. POD is more restrictive. – Cyrilla 25/5, 2014 at 1:10
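
The standard-layout distinction from the last comment can be checked at compile time with the standard type traits. This is only a sketch: the traits report the C++11 notions, not the C++98 POD definition that the ABI uses. For the three classes in the question the two happen to agree (A qualifies, B and C do not).

```cpp
#include <type_traits>

class A { public: int val; char a; };
class B : public A { public: char b; };
class C : public B { public: char c; };

// A keeps all of its data members in a single class, so it is standard-layout.
static_assert(std::is_standard_layout<A>::value, "A should be standard-layout");

// B and C spread data members over a base and a derived class, which rules them out.
static_assert(!std::is_standard_layout<B>::value, "B should not be standard-layout");
static_assert(!std::is_standard_layout<C>::value, "C should not be standard-layout");

// All three remain trivially copyable; only the layout guarantee differs.
static_assert(std::is_trivially_copyable<A>::value, "A should be trivially copyable");
static_assert(std::is_trivially_copyable<B>::value, "B should be trivially copyable");
static_assert(std::is_trivially_copyable<C>::value, "C should be trivially copyable");
```

If the file compiles, every assertion holds; there is nothing to run.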
