C++11 empty list-initialization of a union: is it guaranteed to initialize the full length of the union?

In C++11, I have the following union:

#include <cstdint>

union SomeData
{
    std::uint8_t Byte;
    std::uint16_t Word;
    std::uint32_t DWord;
    unsigned char String[128];
};

If I initialize the union like this:

SomeData data {};

Is it guaranteed that the entire contents of the union will be zeroed out? Put another way: is an empty list-initializer of a union functionally equivalent to memset-ing the union to zero?

memset(&data, 0, sizeof(data));

In particular, I'm concerned about the string data. I'd like to ensure the entire length of the string contains zeros. It appears to work in my current compiler, but does the language of the spec guarantee this to always be true?

If not: is there a better way to initialize the full length of the union to zero?

Magnification answered 1/3, 2017 at 14:56 Comment(0)

No, it is not guaranteed that the entire union will be zeroed out. Only the first declared member of the union, plus any padding, is guaranteed to be zeroed (proof below).

So to ensure the entire memory region of the union object is zeroed, you have these options:

  • Order the members so that the largest member comes first and is therefore the one zeroed out.
  • Use std::memset or equivalent functionality. To avoid forgetting that call, you can give SomeData a default constructor that performs it.
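The second option can be sketched as follows. This is only a minimal illustration, assuming the union from the question. The helper all_bytes_zero is a hypothetical name introduced here for checking the result and is not part of the original post. The cast to void* is an assumption made to keep compilers from warning about memset on a type with a user-provided constructor.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

union SomeData
{
    std::uint8_t Byte;
    std::uint16_t Word;
    std::uint32_t DWord;
    unsigned char String[128];

    // User-provided default constructor: zeroes every byte of the union,
    // regardless of which member is declared first.
    SomeData() { std::memset(static_cast<void*>(this), 0, sizeof(*this)); }
};

// Hypothetical helper (not from the original post): inspects the object
// representation byte by byte.
bool all_bytes_zero(const SomeData& d)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&d);
    for (std::size_t i = 0; i < sizeof(SomeData); ++i)
    {
        if (p[i] != 0)
            return false;
    }
    return true;
}
```

With a user-provided constructor, both SomeData data; and SomeData data {}; run the constructor, so the zeroing no longer depends on how "padding" is interpreted.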

Quoting C++11:

8.5.4 [dcl.init.list]/3

List-initialization of an object or reference of type T is defined as follows:

  • If the initializer list has no elements and T is a class type with a default constructor, the object is value-initialized.

8.5 [dcl.init]/7

To value-initialize an object of type T means:

  • if T is a (possibly cv-qualified) class type (Clause 9) with a user-provided constructor (12.1), then the default constructor for T is called (and the initialization is ill-formed if T has no accessible default constructor);
  • if T is a (possibly cv-qualified) non-union class type without a user-provided constructor, then the object is zero-initialized and, if T’s implicitly-declared default constructor is non-trivial, that constructor is called.
  • ...
  • otherwise, the object is zero-initialized.

8.5 [dcl.init]/5:

To zero-initialize an object or reference of type T means:

...

  • if T is a (possibly cv-qualified) union type, the object’s first non-static named data member is zero-initialized and padding is initialized to zero bits;

From these quotes, you can see that using {} to initialise data will cause the object to be value-initialized (since SomeData is a class type with a default constructor).

Value-initializing a union without a user-provided default constructor (which SomeData is) means zero-initializing it.

Finally, zero-initializing a union means zero-initializing its first non-static named data member.
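Given that chain, placing the largest member first makes the guaranteed zero-initialization cover that whole member. The following is a rough sketch under that assumption. The names SomeDataLargestFirst and string_is_zeroed are hypothetical and chosen here only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Same members as the union in the question, reordered so the largest
// member is declared first.
union SomeDataLargestFirst
{
    unsigned char String[128];
    std::uint8_t Byte;
    std::uint16_t Word;
    std::uint32_t DWord;
};

// Hypothetical helper: checks every element of the first member.
bool string_is_zeroed(const SomeDataLargestFirst& d)
{
    for (std::size_t i = 0; i < sizeof(d.String); ++i)
    {
        if (d.String[i] != 0)
            return false;
    }
    return true;
}
```

After SomeDataLargestFirst data {};, all 128 elements of String are zero because String is the first non-static named data member. Any bytes beyond the largest member, if an implementation adds them for alignment, fall under the padding wording quoted above.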

Aristotle answered 1/3, 2017 at 15:0 Comment(18)
That's a bummer. Is there a simple way to initialize the full union to zero? Or is 'memset' the only way? - Magnification
@BTownTKD Put the largest member first? - Flanigan
Oh. Yeah. Durp. - Magnification
My understanding of "padding is initialized to zero bits" is that the remaining part of the union will be set to 0. - Naraka
@SergeBallesta I tested it on an example from a related question, and it seems you are right! I have edited my answer so that it now points to yours. - Horatius
This is core issue 694. The stated intent is to zero out the entire union. - Twofold
@SergeBallesta I'm not quite sure that the space reserved for inactive members of the union counts as padding. Padding is normally the area outside of members, introduced for alignment purposes. - Aristotle
@Twofold Unfortunately, "padding" is wonderfully underspecified in the standard (it doesn't even have an index entry!). Are the parts of a union's inactive members which don't overlap the active member really padding? Wouldn't padding be more like extra space at the end for array-alignment purposes? - Aristotle
@Angew, here is a quote from issue 694 that explains why the C committee added "and padding is zero-initialized": "The C committee is considering changing the definition of zero-initialization of unions to guarantee that the bytes of the entire union are set to zero before assigning 0, converted to the appropriate type, to the first member." But it seems to apply to zero-initialization, which is performed only on static objects, no? - Horatius
@Horatius If you read the quotes in the answer, you'll see that zero-init happens as part of this value-init. And I've read the rationale, which would IMO be realised by the proposed 2008 resolution. However, I cannot be sure that the actually accepted 2010 resolution has the same effect, especially since it's preceded by "The C Committee has changed its approach to this question". It's very possible the intent is to zero out the entire union, but I don't really see how this wording about padding guarantees that. - Aristotle
@Angew 9.5 Unions [class.union] says "Each non-static data member is allocated as if it were the sole member of a struct." So at first-member initialization time, it should be considered the sole member, and padding should be all bytes after it. But I would not rely too much on all C++ implementers understanding the standard that way ;-) - Naraka
@SergeBallesta Good find, it seems that's the intent, then. However, as you say, it's far from unambiguous. - Aristotle
@Angew: even if your answer is the accepted one, and even though I really think that the intent is that all bytes of the union are set to 0, I've added a warning in my answer :-) - Naraka
@Angew, @SergeBallesta, is the union SomeData not an aggregate? This is important because, according to the standard, aggregate initialization does not cause zero-initialization, no? - Horatius
@Horatius Doesn't matter. Aggregate initialisation is the second bullet point under 8.5.4/3 (and starts with "Otherwise"), while value initialisation is the first (the one I quoted). So, since the first bullet point applies, value-init happens and aggregate-init does not. - Aristotle
@Angew, you are right, my reference was the C++14 standard. In that later standard, the first bullet only applies to copy initialization. With this change to the standard, is there any risk that compilers remove this zero-initialization? - Horatius
@Horatius Wow, that's quite a change! When I have the time, I will definitely incorporate it into the answer (even though the Q specifies C++11, it's definitely very relevant). However, since the brace-init-list is empty, it effectively means that the first member will be initialised from an empty initializer list (C++14 8.5.1/7), which again means value-init. But even less is then known about padding, apparently. - Aristotle
@Angew The final wording is basically identical to WG14 N1387, linked in the issue, and the stated intent of that paper is clearly for this wording to reflect the "byte-flooding" behavior. - Twofold

The entire union will be zeroed out. More exactly, the first member of the union will be zero-initialized and all the remaining bytes in the union will be set to 0 as padding.

References (emphasis mine):

8.5 Initializers [dcl.init]
...

5 To zero-initialize an object or reference of type T means:
...
— if T is a (possibly cv-qualified) union type, the object’s first non-static named data member is zero initialized and padding is initialized to zero bits;

That means that the first member of the union (here std::uint8_t Byte;) will be initialized to 0 and that all other bytes in the union will be set to 0 because they are padding bytes.

But beware. As Angew notes, "padding" is wonderfully underspecified in the standard, and a compiler could interpret the padding bytes in a union as only the bytes that follow the largest member. I would find that odd, because compatibility changes are specifically documented, and previous versions (C) first initialized everything to 0 and then performed the specific initialization. But a new implementer might not be aware of that...

TL;DR: I really think that the intent of the standard is that all bytes in the union are set to 0 in OP's example, but for a mission-critical program, I would certainly add an explicit zeroing constructor...

Naraka answered 1/3, 2017 at 15:54 Comment(3)
Interesting! That would be a happy interpretation indeed. Is there anything in the spec which clarifies the definition of "padding" in a union? - Magnification
Interesting. It does mean, surely, that everyone needs to put the largest member first in a union to get the entire union initialised to zero, right? - Okeefe
@Okeefe More exactly, if the first member in the union is the largest, you can be sure that all bytes in the union will be set to 0. - Naraka
