Standard-layout and tail padding

Asked 18/12, 2018 at 16:32 Answered 18/12, 2018 at 18:41

Solved c++ g++ language-lawyer clang++ standard-layout

David Hollman recently tweeted the following example (which I've slightly reduced):

struct FooBeforeBase {
    double d;
    bool b[4];
};

struct FooBefore : FooBeforeBase {
    float value;
};

static_assert(sizeof(FooBefore) > 16);

//----------------------------------------------------

struct FooAfterBase {
protected:
    double d;
public:  
    bool b[4];
};

struct FooAfter : FooAfterBase {
    float value;
};

static_assert(sizeof(FooAfter) == 16);

You can examine the layout in clang on godbolt and see that the reason the size changed is that in FooBefore, the member value is placed at offset 16 (maintaining a full alignment of 8 from FooBeforeBase) whereas in FooAfter, the member value is placed at offset 12 (effectively using FooAfterBase's tail-padding).
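If you would rather check the offsets locally than on godbolt, a minimal sketch follows. The helper name `value_offset` is made up for illustration. It measures where `value` lands with plain pointer arithmetic, because `offsetof` is only conditionally supported for non-standard-layout types. The numbers are ABI-dependent: on gcc and clang for an Itanium-ABI target, the question reports 16 for FooBefore and 12 for FooAfter.

```cpp
#include <cstddef>

struct FooBeforeBase {
    double d;
    bool b[4];
};

struct FooBefore : FooBeforeBase {
    float value;
};

struct FooAfterBase {
protected:
    double d;
public:
    bool b[4];
};

struct FooAfter : FooAfterBase {
    float value;
};

// Hypothetical helper (not from the question): byte offset of the member
// `value` inside an object of type T, measured on a live object.
template <class T>
std::ptrdiff_t value_offset() {
    T obj{};
    return reinterpret_cast<const char*>(&obj.value)
         - reinterpret_cast<const char*>(&obj);
}
```

Calling `value_offset<FooBefore>()` and `value_offset<FooAfter>()` and printing the results side by side shows whether your toolchain reuses the base's tail padding.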

It is clear to me that FooBeforeBase is standard-layout, but FooAfterBase is not (because its non-static data members do not all have the same access control, [class.prop]/3). But what is it about FooBeforeBase's being standard-layout that requires this respect of padding bytes?

Both gcc and clang reuse FooAfterBase's padding, ending up with sizeof(FooAfter) == 16. But MSVC does not, ending up with 24. Is there a required layout per the standard and, if not, why do gcc and clang do what they do?

There is some confusion, so just to clear up:

FooBeforeBase is standard-layout
FooBefore is not (both it and a base class have non-static data members, similar to E in this example)
FooAfterBase is not (it has non-static data members of differing access)
FooAfter is not (for both of the above reasons)
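The four claims above can be checked at compile time with `std::is_standard_layout`. This is only a sketch that repeats the types from the question. Unlike the sizes, these results come from the language rules and do not depend on the ABI.

```cpp
#include <type_traits>

struct FooBeforeBase {
    double d;
    bool b[4];
};

struct FooBefore : FooBeforeBase {
    float value;
};

struct FooAfterBase {
protected:
    double d;
public:
    bool b[4];
};

struct FooAfter : FooAfterBase {
    float value;
};

// All data members in one class, all with the same access: standard-layout.
static_assert(std::is_standard_layout<FooBeforeBase>::value,
              "FooBeforeBase is standard-layout");
// Data members in both the class and its base: not standard-layout.
static_assert(!std::is_standard_layout<FooBefore>::value,
              "FooBefore is not standard-layout");
// Data members with differing access control: not standard-layout.
static_assert(!std::is_standard_layout<FooAfterBase>::value,
              "FooAfterBase is not standard-layout");
// Both of the reasons above apply.
static_assert(!std::is_standard_layout<FooAfter>::value,
              "FooAfter is not standard-layout");
```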

Alurd answered 18/12, 2018 at 16:32 Comment(9)

Who says that this behavior is "required" by anything in the standard? It could simply be a manifestation of how the compiler goes about implementing things. – Ailey 18/12, 2018 at 17:23

@NicolBolas It may very well not be required. MSVC does not do this (its FooAfter is also 24 bytes), but gcc and clang do - and it seems like that's a conscious choice on their parts. – Alurd 18/12, 2018 at 17:32

"it seems like that's a conscious choice on their parts." What makes you say that? – Ailey 18/12, 2018 at 17:33

It is never required that there is no padding between class members. It may be required that there is padding. So the correct question is not which part of the standard requires gcc to reuse the end-padding, but what allows it to do so in the second case. Another question is whether something disallows such reuse in the first case. – Marrissa 18/12, 2018 at 17:53

I think the right way to phrase this question is just the more general: Why do gcc and clang do what they do here? – Alurd 18/12, 2018 at 17:56

@Barry: If that's the question you want answered, then it's a duplicate of the Q&A n.m linked to. They do it because the Itanium ABI says to do it. – Ailey 18/12, 2018 at 17:57

You're also changing the access control of members, which can lead to changes of layout. Once you mark some members as private or protected, the compiler is free to potentially use processor features to ensure that access at runtime. (Not sure that any do, though.) – Cotyledon 18/12, 2018 at 18:39

@NicolBolas I feel like the right answer to this question was split between multiple answers and comments on both of these questions, so I'm community-wiki-ing the answer here and marking the other as a duplicate of this one (... in two days). – Alurd 18/12, 2018 at 18:46

The answer to this question doesn't come from the standard but rather from the Itanium ABI (which is why gcc and clang have one behavior but msvc does something else). That ABI defines a layout, the relevant parts of which for the purposes of this question are:

For purposes internal to the specification, we also specify:

dsize(O): the data size of an object, which is the size of O without tail padding.

and

We ignore tail padding for PODs because an early version of the standard did not allow us to use it for anything else and because it sometimes permits faster copying of the type.

Where the placement of members other than virtual base classes is defined as:

Start at offset dsize(C), incremented if necessary for alignment to nvalign(D) for base classes or to align(D) for data members. Place D at this offset unless [... not relevant ...].

The term POD has disappeared from the C++ standard, but it means standard-layout and trivially copyable. In this question, FooBeforeBase is a POD, so the Itanium ABI ignores its tail padding (that is, it does not reuse it); hence dsize(FooBeforeBase) is 16.

But FooAfterBase is not a POD (it is trivially copyable, but it is not standard-layout). As a result, tail padding is not ignored, so dsize(FooAfterBase) is just 12, and the float can go right there.
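To make the arithmetic concrete, here is a rough sketch of the quoted placement rule. The helpers `align_up` and `derived_size` are hypothetical names, not part of the ABI text, and the constants assume a typical x86-64 target where a float has size and alignment 4 and a class containing a double has alignment 8.

```cpp
#include <cstddef>

// Hypothetical helper: round `offset` up to the next multiple of `align`.
constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Hypothetical helper: size of a class that appends one data member after a
// base whose data size is `base_dsize`. The member goes at dsize rounded up
// to the member's alignment; the total is then rounded up to the class
// alignment.
constexpr std::size_t derived_size(std::size_t base_dsize,
                                   std::size_t member_size,
                                   std::size_t member_align,
                                   std::size_t class_align) {
    return align_up(align_up(base_dsize, member_align) + member_size,
                    class_align);
}

// dsize(FooBeforeBase) == 16: the float starts at 16, total size 24.
static_assert(align_up(16, 4) == 16, "FooBefore::value offset");
static_assert(derived_size(16, 4, 4, 8) == 24, "sizeof(FooBefore)");

// dsize(FooAfterBase) == 12: the float starts at 12, total size 16.
static_assert(align_up(12, 4) == 12, "FooAfter::value offset");
static_assert(derived_size(12, 4, 4, 8) == 16, "sizeof(FooAfter)");
```

The only input that differs between the two cases is the base's dsize, which is exactly what the POD distinction controls.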

This has interesting consequences. As Quuxplusone points out in a related answer, implementors also typically assume that tail padding isn't reused, which wreaks havoc on this example:

#include <algorithm>
#include <stdio.h>

struct A {
    int m_a;
};

struct B : A {
    int m_b1;
    char m_b2;
};

struct C : B {
    short m_c;
};

int main() {
    C c1 { 1, 2, 3, 4 };
    B& b1 = c1;
    B b2 { 5, 6, 7 };

    printf("before operator=: %d\n", int(c1.m_c));  // 4
    b1 = b2;
    printf("after operator=: %d\n", int(c1.m_c));  // 4

    printf("before std::copy: %d\n", int(c1.m_c));  // 4
    std::copy(&b2, &b2 + 1, &b1);
    printf("after std::copy: %d\n", int(c1.m_c));  // 64, or 0, or anything but 4
}

Here, = does the right thing (it does not overwrite B's tail padding), but std::copy() has a library optimization that reduces to memmove(), which does not care about tail padding because it assumes it does not exist.

Alurd answered 18/12, 2018 at 16:32 Comment(2)

I think this answer is correct except for one thing: the working definition of POD in the ABI is not the C++11 definition of "trivial + standard-layout", but rather what POD was back in C++03, which is basically an aggregate + some other restrictions on members (see 8.5.1 and 9p4). An aggregate has only public members. A standard-layout type may have non-public members, as long as all have the same access level. So the upshot is that a class with only private members may be trivially copyable and standard-layout, hence POD in the C++11 sense, but not POD in the C++03 sense and ... – Milliken 22/1, 2020 at 21:34

... so it is eligible for re-use of padding as shown in this question. – Milliken 22/1, 2020 at 21:34

FooBefore derived;
FooBeforeBase src, &dst=derived;
....
memcpy(&dst, &src, sizeof(dst));

If the additional data member was placed in the hole, memcpy would have overwritten it.

As is correctly pointed out in comments, the standard doesn't require that this memcpy invocation should work. However the Itanium ABI is seemingly designed with this case in mind. Perhaps the ABI rules are specified this way in order to make mixed-language programming a bit more robust, or to preserve some kind of backwards compatibility.
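For contrast, here is a sketch of a copy into the base subobject that the language does guarantee: plain assignment through a base reference. The helper name `assign_base_keep_value` is made up for this example. It works the same for FooBefore and for FooAfter, where the tail padding is reused on Itanium-ABI targets, because the implicitly-defined copy assignment copies members rather than raw bytes.

```cpp
struct FooBeforeBase {
    double d;
    bool b[4];
};

struct FooBefore : FooBeforeBase {
    float value;
};

struct FooAfterBase {
protected:
    double d;
public:
    bool b[4];
};

struct FooAfter : FooAfterBase {
    float value;
};

// Hypothetical helper: overwrite only the Base part of a Derived object by
// assignment, then report what is left in the derived class's own member.
template <class Derived, class Base>
float assign_base_keep_value() {
    Derived derived{};
    derived.value = 42.0f;

    Base src{};
    src.b[0] = true;

    Base& dst = derived;
    dst = src;             // memberwise copy of the base subobject only
    return derived.value;  // untouched by the assignment
}
```

Swapping `dst = src` for a memcpy of sizeof(Base) bytes is the case the standard does not guarantee when the destination is a base class subobject.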

Relevant ABI rules can be found here.

A related answer can be found here (this question might be a duplicate of that one).

Marrissa answered 18/12, 2018 at 17:14 Comment(3)

TriviallyCopyable does not work if you try to copy into a base class subobject. – Ailey 18/12, 2018 at 17:18

The next question: why is the Itanium ABI designed this way? :) – Eventuality 18/12, 2018 at 18:2

@Eventuality I don't know but my guess is up there in the answer. – Marrissa 18/12, 2018 at 18:9

-1

Here is a concrete case which demonstrates why the second case cannot reuse the padding:

union bob {
  FooBeforeBase a;
  FooBefore b;
};

bob.b.value = 3.14;
memset( &bob.a, 0, sizeof(bob.a) );

This cannot clear bob.b.value.

union bob2 {
  FooAfterBase a;
  FooAfter b;
};

bob2.b.value = 3.14;
memset( &bob2.a, 0, sizeof(bob2.a) );

This is undefined behavior.

Oarlock answered 18/12, 2018 at 16:59 Comment(8)

"this cannot clear bob.b.value." Since FooBefore is not standard layout, the common-initial-sequence rule doesn't apply. So you can't access bob.a after you've set bob.b. – Ailey 18/12, 2018 at 17:4

@NicolBolas Why would FooBefore not be standard layout? – Bedelia 18/12, 2018 at 17:8

@Holt: both FooBeforeBase and FooBefore have non-static data members, therefore FooBefore is not standard-layout. – Eventuality 18/12, 2018 at 17:15

@Bedelia A standard layout type has data members either in a base class or not in a base class, but not both. en.cppreference.com/w/cpp/named_req/StandardLayoutType – Marrissa 18/12, 2018 at 17:19

@Holt: It's the "has no element of the set" part. That's what it says once you untangle all of the spec-language. – Ailey 18/12, 2018 at 17:19

CIS doesn't matter anyway. It allows reading through a different member, not writing. – Gujarat 20/12, 2018 at 7:49

@t.c that is a surprise to me. You certain? – Oarlock 20/12, 2018 at 11:9

Yes. See [class.mem]/25 in the current WP. – Gujarat 20/12, 2018 at 11:29

-1

FooBefore is not standard-layout either: two classes (FooBefore and FooBeforeBase) declare non-static data members. Thus the compiler is allowed to place some data members arbitrarily, and hence the differences between toolchains arise. In a standard-layout hierarchy, at most one class (either the most derived class or one intermediate class) may declare non-static data members.

Ardith answered 18/12, 2018 at 18:41 Comment(0)

-2

Here's a case similar to the one in n.m.'s answer.

First, let's have a function, which clears a FooBeforeBase:

void clearBase(FooBeforeBase *f) {
    memset(f, 0, sizeof(*f));
}

This is fine: clearBase gets a pointer to FooBeforeBase and assumes that, because FooBeforeBase is standard-layout, memsetting it is safe.

Now, if you do this:

FooBefore b;
b.value = 42;
clearBase(&b);

You don't expect clearBase to clear b.value, as b.value is not part of FooBeforeBase. But if FooBefore::value were put into the tail padding of FooBeforeBase, it would have been cleared as well.

Is there a required layout per the standard and, if not, why do gcc and clang do what they do?

No, reusing tail padding is not required. It is an optimization that gcc and clang perform.

Eventuality answered 18/12, 2018 at 17:49 Comment(5)

But the standard doesn't allow clearBase to work on a base class subobject. Well, if we're going to be technical, memset isn't allowed on TriviallyCopyable types period, but even if that were a memcpy from a zero-initialized FooBeforeBase static instance, it still wouldn't be allowed on base class subobjects. – Ailey 18/12, 2018 at 17:51

@NicolBolas: I nowhere stated that. As a user of clearBase, you may not know what is inside. So this behavior of the compiler guarantees that you don't shoot yourself in the foot. Please be a little more practical here, even if the question has the language-lawyer tag. We are already talking about something that is not covered by the standard (i.e., tail-padding optimization). – Eventuality 18/12, 2018 at 17:55

"I nowhere stated that". You very much did, right after you said: "Now, if you do this:". That's code which calls clearBase on a base class subobject, which is UB. – Ailey 18/12, 2018 at 17:56

"But the standard doesn't allow clearBase to work" Rather, it doesn't guarantee that it will work. – Marrissa 18/12, 2018 at 17:57

@NicolBolas: okay then. This behavior is there to guarantee that, even if it is UB, it doesn't do any harm. UB doesn't automatically mean that something bad must happen. – Eventuality 18/12, 2018 at 18:1
