Does the C++ standard guarantee the initialization of padding bytes to zero for non-static aggregate objects?

Does C++ support a language construct that allows us to initialize an object and all of its padding to zero? I found some encouraging wording on cppreference.com about zero-initialization that suggests that, under some conditions, the padding bytes will also be zeroed out.

Quoting from cppreference.com: zero-initialization

Zero initialization is performed in the following situations:

  1. As part of value-initialization sequence for non-class types and for members of value-initialized class types that have no constructors, including value initialization of elements of aggregates for which no initializers are provided.

The effects of zero initialization are:

  • If T is a scalar type, the object's initial value is the integral constant zero explicitly converted to T.
  • If T is a non-union class type, all base classes and non-static data members are zero-initialized, and all padding is initialized to zero bits. The constructors, if any, are ignored.
  • ...

One will find references to zero-initialization in value-initialization, aggregate-initialization and list-initialization.

I tested with fairly recent GCC and Clang C++ compilers, and their behavior diverges.

Frankly, I tried hard to parse these rules, but given the divergent compiler behavior, I could not figure out how to interpret them correctly.

See code here (C++11 or later is required). And here are the results:

Given: Foo

struct Foo
{
    char x;
    int y;
    char z;
};
| Construct | g++ | clang++ |
|---|---|---|
| Foo() | x:[----][0x42][0x43][0x44],v: 0 | x:[----][----][----][----],v: 0 |
| | y:[----][----][----][----],v: 0 | y:[----][----][----][----],v: 0 |
| | z:[----][0x4A][0x4B][0x4C],v: 0 | z:[----][----][----][----],v: 0 |
| Foo{} | x:[----][----][----][----],v: 0 | x:[----][0x42][0x43][0x44],v: 0 |
| | y:[----][----][----][----],v: 0 | y:[----][----][----][----],v: 0 |
| | z:[----][----][----][----],v: 0 | z:[----][0x4A][0x4B][0x4C],v: 0 |

Here [----] represents a byte with all bits zero, and [0x..] is a garbage value.

As you can see, the compiler outputs indicate that the padding is not initialized. Both Foo() and Foo{} are value-initializations. In addition, Foo{} is an aggregate-initialization with missing initializers. Why isn't the zero-initialization rule getting triggered? Why isn't the padding rule getting triggered?

I already understand that relying on padding bytes being zero is not a good idea and may even be undefined behavior, but I think that is beside the point of this question.

  • Question 1: Does the standard provide a way to reliably initialize the padding bytes?
  • Question 2: See also "Does C initialize structure padding?". Is it applicable here?
  • Question 3: Are these compilers compliant with the standard?
  • Question 4: What explains the compilers' clearly divergent behavior?
Prepotent answered 3/2, 2022 at 22:17 Comment(9)
Why didn't you tag it c++ and maybe also language-lawyer? – Veta
In your code you are also using C++20 specifically. If you don't intend to ask about a specific language version, I would suggest removing all of the version-specific tags. – Wakeen
I believe zero-initialization applies only to static/thread storage duration objects. Dynamic and automatic objects don't (by default) get their padding zeroed out unless you explicitly zero-initialize them, as that is an extra runtime cost. – Minter
Why do you care about padding initialization? If you rely on specific values of padding, why won't you make padding explicit members, so that you can rely on the standard requirements and guarantees for members? After all, initializing padding is wasted CPU cycles, which is against the C++ principle of not paying for what you don't use. – Gluteus
I would note that you have -O3 defined for the compilers. The compiler can do almost anything as long as there is no observable difference in behavior. Is padding observable? – Minter
@MaximEgorushkin, for one, uninitialized padding pollutes Valgrind and memory sanitiser reports. It'd be nice to have a way to force C++ to zero-initialise everything with a compiler flag. – Impeccable
@Impeccable That valgrind warning is because code intentionally reads uninitialized memory. The warning is valid, initializing that object with value initialization fixes the warning, no new compiler flag is necessary. – Gluteus
@MaximEgorushkin the GCC and LLVM developers thought it was necessary to add the -ftrivial-auto-var-init=choice flag. – Impeccable
@Impeccable But for a different reason than valgrind correctly flagging reads of uninitialized variables. You are clutching at straws here, conflating different things. – Gluteus

Update:

I should clarify that my answer and my comments below assume that padding having a certain state is meaningful to observable behavior in the first place, i.e. that, absent any other modification of the object or the complete object to which it belongs, the padding of the object will remain in its given state and can be, potentially, read back with that value.

However, the standard says practically nothing about the behavior of padding, and going by the information I could find from WG21, the current understanding of the standard seems to be that padding always has an unspecified value in C++. It would therefore be pointless to ask whether the padding should be zeroed, since reading it back in any form would not need to produce zero again, which effectively permits the compiler to ignore any requirement to initialize padding.

Notably, that is different from C, where the standard explicitly specifies the conditions under which padding may take on unspecified values and when it must be stable.

See e.g. the comment on CWG 2536.
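
If a fully defined byte image is what is actually needed, one option raised in the comments on the question is to turn the padding into explicit members, so that the ordinary guarantees for members apply to every byte. The following is only a minimal sketch under stated assumptions: `FooExplicit`, `pad0`, `pad1` and `all_bytes_zero` are hypothetical names, and the pad sizes assume a 4-byte `int` with 4-byte alignment.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical variant of Foo in which the padding is spelled out as members.
struct FooExplicit
{
    char x;
    char pad0[3];  // assumed: 3 bytes so that y lands at offset 4
    int  y;
    char z;
    char pad1[3];  // assumed: 3 trailing bytes up to a multiple of alignof(int)
};

// These checks only hold on targets matching the assumptions above; if one
// fails, the pad sizes need to be adjusted for that target.
static_assert(sizeof(int) == 4 && alignof(int) == 4, "sketch assumes a 4-byte int");
static_assert(offsetof(FooExplicit, y) == 4, "unexpected offset of y");
static_assert(sizeof(FooExplicit) == 12, "compiler inserted additional padding");

// Returns true if every byte of the object representation is zero.
bool all_bytes_zero(const FooExplicit& f)
{
    unsigned char bytes[sizeof(FooExplicit)];
    std::memcpy(bytes, &f, sizeof f);
    for (unsigned char b : bytes)
        if (b != 0)
            return false;
    return true;
}
```

With this layout, `FooExplicit f{};` initializes `pad0` and `pad1` like any other member without an initializer, so under the stated assumptions there are no unnamed bytes left for the compiler to skip.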


The padding bits will be zeroed only if the class object is zero-initialized, as expressed in your quote.

For objects with automatic or dynamic storage duration, zero-initialization happens only if the object is value-initialized and its class has a non-deleted implicit default constructor and no user-provided default constructor ([dcl.init.general]/8.1). These conditions are fulfilled here.

Value-initialization should always happen with the () initializer. ([dcl.init.general]/16.4)

Value-initialization could also happen for {} as initializer. However, if the class is an aggregate as it is here, aggregate-initialization is preferred, which doesn't result in value-initialization. ([dcl.init.list]/3.4)

The preference of aggregate-initialization over value-initialization was changed by CWG 1301 before C++14, which may also be intended to apply to C++11. Before C++11 the rules may have been different; I haven't checked.
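
The difference between the two initializer forms can be sketched roughly as follows. The helper names `sum_after_paren_init` and `sum_after_brace_init` are hypothetical, and `std::is_aggregate` needs C++17. The sketch only shows that both forms give the members the value zero; it makes no claim about the padding.

```cpp
#include <type_traits>

struct Foo
{
    char x;
    int y;
    char z;
};

// Foo has no user-declared constructors, no base classes and only public
// members, so it is an aggregate.
static_assert(std::is_aggregate<Foo>::value, "Foo is expected to be an aggregate");

// Foo() is value-initialization. Because the default constructor is implicit
// and not user-provided, the object is zero-initialized first.
int sum_after_paren_init()
{
    Foo f = Foo();
    return f.x + f.y + f.z;
}

// Foo{} on an aggregate is aggregate-initialization with no initializers.
// Each member is initialized from an empty initializer list, which sets it to
// zero, but the object as a whole is not value-initialized.
int sum_after_brace_init()
{
    Foo f{};
    return f.x + f.y + f.z;
}
```

Both functions return 0, so the two forms cannot be told apart by looking at members alone; any difference would only show up in the padding.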


So I would say Clang is behaving correctly, and GCC is wrong on Foo() while doing unnecessary work for Foo{} (although, as noted by @PeterCordes below, zeroing the whole object including the padding is actually more efficient).


Note that it is not completely clear to me whether inspecting the values of the non-zero-initialized padding bytes has well-defined behavior the way you are doing it.

For the default-initialized case, reading the member has undefined behavior because its value will be indeterminate.

I expect that the padding is also supposed to have indeterminate values before new potentially initializes them. In that case inspecting their values if there is no zero-initialization would cause undefined behavior.
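
For reference, an inspection of this kind might look roughly like the sketch below. It is not the code from the question: it assumes a pre-filled local buffer with placement new instead of an overloaded operator new, and `dump_bytes`, `dump_paren_init` and `dump_brace_init` are hypothetical names. Given the concerns above, what it reports for padding bytes should be treated as an observation about one compiler and optimization level, not as something the standard promises.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <new>
#include <string>

struct Foo
{
    char x;
    int y;
    char z;
};

// Formats n bytes starting at p, using [----] for a zero byte and [0xNN]
// for anything else, mirroring the notation used in the question.
std::string dump_bytes(const void* p, std::size_t n)
{
    const unsigned char* b = static_cast<const unsigned char*>(p);
    std::string out;
    char cell[8];
    for (std::size_t i = 0; i < n; ++i) {
        if (b[i] == 0) {
            out += "[----]";
        } else {
            std::snprintf(cell, sizeof cell, "[0x%02X]", static_cast<unsigned>(b[i]));
            out += cell;
        }
    }
    return out;
}

// Fills the storage with a non-zero pattern, constructs Foo() in it and dumps
// the bytes. The optimizer is free to drop the memset, so the padding cells
// in the result are not reliable.
std::string dump_paren_init()
{
    alignas(Foo) unsigned char storage[sizeof(Foo)];
    std::memset(storage, 0x42, sizeof storage);
    Foo* f = new (storage) Foo();
    return dump_bytes(f, sizeof(Foo));
}

// Same as above, but with the {} initializer.
std::string dump_brace_init()
{
    alignas(Foo) unsigned char storage[sizeof(Foo)];
    std::memset(storage, 0x42, sizeof storage);
    Foo* f = new (storage) Foo{};
    return dump_bytes(f, sizeof(Foo));
}
```

Only the cells that belong to x, y and z are guaranteed to show up as [----]; the first cell, for instance, is always the byte of x.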

Wakeen answered 3/2, 2022 at 23:23 Comment(12)
It seems gcc and clang are doing constant-propagation from the init loop after malloc (in the operator new overload) into the constructor, so it actually just does one store. Commenting that out, we can see in the asm when GCC or clang choose to just do 2 byte stores and a dword store, or zero the whole object with a dword + qword store. gcc.godbolt.org/z/9M5ox6KPc Interestingly, they make opposite choices: for new Foo() gcc does 3 separate stores avoiding padding, clang does 2 that cover the whole object. For new Foo{} that's reversed. – Actuary
(So GCC's "unnecessary work" on Foo{} is actually the more efficient way, and what they should both be doing in both cases, for performance even when not required by correctness. 2 stores are generally better than 3 on modern x86 (and smaller code size), regardless of misalignment, unless it's unlucky with a page-split. But malloc returns 16-byte aligned memory in the ABI it's targeting, x86-64 SysV. The object isn't 16 bytes for one xorps xmm0,xmm0 / movups, otherwise that would be much better.) – Actuary
@PeterCordes I didn't really think about the code generation. I have added a note referring to your comments. It is however another question whether the standard allows zeroing the padding for Foo{} in the specific case that operator new already set them. I do not expect that it is intended to forbid it, but I am not sure that it is well-specified. – Wakeen
100% agreed, that's what I meant by "for performance even when not required by correctness". You always have to zero when the standard says you have to. When the standard doesn't require it, it depends on the target ISA and typical micro-architectures whether the optimal strategy for doing the required zeroing also happens to zero the padding. e.g. on 32-bit x86 it probably wouldn't, since you don't have qword integer stores so you need 3 stores unless you use SSE2 movq or movlps, and mov dword mem, imm32 takes more code size than mov byte mem, imm8. – Actuary
I've noticed GCC missed optimizations in the past where the standard doesn't forbid zeroing or otherwise stepping on padding, but GCC chooses not to, to the detriment of performance. It makes sense that compilers need to be careful not to invent writes in general, so perhaps they don't have an easy way to keep track of memory that they're allowed to step on but don't need to modify. The target-independent front-end would probably only want to tell the back-end optimizer about what actually has to be done, but maybe doesn't communicate that padding can be written when it doesn't have to be. – Actuary
The point of investigating was to see if GCC and clang behaved differently on actually fresh memory where they couldn't see any previous writes; I was curious if an overloaded new that did initialization was making a difference to gcc or if it was always buggy even in the normal case. As well as the performance considerations, whether clang would ever leave any memory unwritten (yes) when there isn't data from new's init loop to store along with the constructor values. – Actuary
@PeterCordes See gcc.godbolt.org/z/aYsvqoWxE. In this variation of my original program, I am replacing malloc/free with my custom allocation instead of overriding new/delete directly. I believe that now operator new would simply call malloc and the compiler will not be able to see through it. Would this qualify? Still, the compiler behavior with regard to padding is unchanged. – Prepotent
@ShriramV: I think we were already pretty sure this GCC bug where it doesn't do enough zeroing for Foo() was basically independent of where the original memory was coming from. Would probably also happen for locals constructed on the stack. (Probably easier to look at the asm than to construct a test case that dirties some stack memory before another function initializes locals.) – Actuary
@PeterCordes The behavior for Foo() is the same for a local variable, but interestingly Clang does the full zeroing with Foo{} with a local variable. This makes me think that Clang's developers maybe did think that clearing the padding is not allowed in that case if the memory had been set before, but maybe it is just some unintended consequence of how the translation process works. gcc.godbolt.org/z/vKf133jbh – Wakeen
@PeterCordes Just for a more complete picture: MSVC does the full zeroing for both Foo() and Foo{} for stack and new, and ICC seems to be doing something completely different for different optimization levels: gcc.godbolt.org/z/4WjGjnn87 – Wakeen
I am so worried about this. This has been used for data exploits. See wiki.sei.cmu.edu/confluence/display/cplusplus/… – Gibson
@Gibson Interestingly the description of the second non-compliant example is wrong. test arg{}; is not value-initialization of the whole structure after the defect report I mentioned. Of course relying on value-initialization at all is a problem if compilers don't implement it in a conforming way. The compliant solutions given on the page fortunately don't rely on it. – Wakeen

© 2022 - 2024 — McMap. All rights reserved.