In C++, does initializing a global variable with itself have undefined behaviour?

int i = i;      // namespace scope: static storage duration

int main() {
    int a = a;  // block scope: automatic storage duration
    return 0;
}

int a = a surely has undefined behaviour (UB); more details are in "Is reading an uninitialized value always an undefined behaviour? Or are there exceptions to it?".

But what about int i = i? In C++ we are allowed to initialize globals with non-constant values. i is declared and zero-initialized (since it has file scope) before its initializer is evaluated, in which case the definition later assigns 0 to it. Is it safe to say this does not have UB?

Eldoraeldorado answered 15/6, 2021 at 2:25 Comment(3)
Yes, this is safe, as you say, because objects of static storage duration are zero-initialized before any other initialization. - Frayne
File scope is a concept from C. The corresponding concept in C++ is namespace scope. - Graphic
The reason initializers have visibility over the identifier being initialized is so that recursive/circular references are possible, like struct circular_list x = { &x, &x }. That's what it's for. - Erdda

Surprisingly, this is not undefined behavior.

Static initialization [basic.start.static]

Constant initialization is performed if a variable or temporary object with static or thread storage duration is constant-initialized. If constant initialization is not performed, a variable with static storage duration or thread storage duration is zero-initialized. Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization. All static initialization strongly happens before any dynamic initialization.

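The important parts are that a variable with static storage duration is zero-initialized when constant initialization is not performed, and that all static initialization strongly happens before any dynamic initialization. "Static initialization" includes global variable initialization, "static storage duration" includes global variables, and the above clause is applicable here: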

int i = i;

This is not constant initialization. Therefore, zero-initialization is done according to the above clause (for basic integer types zero-initialization means, unsurprisingly, that it's set to 0). The above clause also specifies that zero-initialization must take place before dynamic initialization.

So, what happens here:

  1. i is initialized to 0.
  2. i is then dynamically initialized, from itself, so it still remains 0.
Psalter answered 15/6, 2021 at 2:44 Comment(19)
That's hilarious. This lets you create a non-constructible object! struct S { S() = delete; } s = s; - Exurb
@RaymondChen Actually useful for pre-C++20 captureless lambdas, which don't have a default constructor: https://mcmap.net/q/296732/-construct-an-empty-object-without-the-default-constructor . - Disrepute
How would that change for a global alias? int & i = i; - Serotine
The above is a good question. I'd also like to know how it differs between file scope and block scope. - Eldoraeldorado
@Eldoraeldorado In block scope, on the stack, it is not zero-initialized but left as garbage. Thus, you are reading from an uninitialized variable. - Percyperdido
@JDługosz So it's just a reference to itself. Works just like a normal variable. Globally the reference is zero-initialized, locally it is a garbage value and UB? - Eldoraeldorado
@Eldoraeldorado See en.cppreference.com/w/cpp/language/lifetime and en.cppreference.com/w/cpp/language/storage_duration . - Percyperdido
@RaymondChen It looks like it also triggers a wrong clang warning :) godbolt.org/z/hjG7o7aeE - Consumption
But when does the lifetime of i start? When the static init is complete, or when the dynamic init is complete? If it's the former, then what about e.g. a global std::string s;? Is attempting to read from it before dynamic init finishes somehow not UB? - Shack
@RaymondChen: Don't we have the same with brace initialization? Like struct S { S() = delete; }; S s{}; - Haemal
@Shack The lifetime begins at the start of program execution. Are you confusing it with scope? These words have precise definitions in C++, and if you don't use the right terminology you wind up with word salad. The scope of i begins after its declarator is parsed. That is, when it finds the =. - Percyperdido
We don't know whether i is still zero at dynamic time, because global constructors could have changed it, right? It's as if an assignment i = i happens at the dynamic initialization time. Since that is a no-op, it could be optimized away. - Erdda
@JDługosz My terminology looks ok to me. "lifetime begins at the start of program execution" Citation needed. See my answer for a standard quote, that says it begins when the initialization of i is complete. - Shack
@Shack The lifetime is when (in time) the value is valid. The scope is where in the source code that name can be referred to. For the i in the OP code, which is an int in namespace scope (not any other i defined in the comments thread), the lifetime starts when the program loads, before any code is actually executed. It will always be a valid int, and will be loaded as 0 with the code image. - Percyperdido
@JDługosz Yes, I meant "lifetime". Again, can you provide the source for "lifetime starts when the program loads"? You might be interested in this question. - Shack
Here's constant initialization, "it's guaranteed that it is complete before any other initialization of a static or thread-local object begins, and it may be performed at compile time." So if you're questioning whether "loaded with the image" is formal standard, I agree that although that's what happens on real compilers, the standard can't use such terms, rather, it's just already there no matter how early the code accesses it. - Percyperdido
@JDługosz "questioning whether "loaded with the image" is formal" Not really, I question something else. Please read the link in my previous comment, it explains my doubts. - Shack
@HolyBlackCat: IMHO, the way the C++ Standard attempts to describe the lifetime of standard-layout objects is a needlessly leaky abstraction which leads to ambiguous and broken corner cases. I think it was adopted because specifying that every region of storage that doesn't contain any non-standard layout objects simultaneously contains all standard-layout objects that will fit, and changing one will affect others that share the same storage, would inhibit optimization, but it would have been better to write rules about what objects can be addressed or accessed when than to adopt... - Fushih
...a leaky abstraction about what objects "exist". I doubt that the people who formulated the C++ abstraction really considered all possible corner cases, or would have a consensus answer as to what the Standard is supposed to mean in all of them. - Fushih

The behavior might be undefined for i, since depending on how you read the standard, you could be reading i before its lifetime starts.

[basic.life]/1.2

... The lifetime of an object of type T begins when:

— its initialization (if any) is complete ...

As mentioned in the other answer, i is initialized twice: first zero-initialized statically, then initialized with i dynamically.

Which initialization starts the lifetime? The first one or the final one?

The standard is vague here, and it contains conflicting notes (all of them non-normative). First, there is a footnote in [basic.life]/6 (thanks @eerorika) that explicitly says that dynamic initialization starts the lifetime:

[basic.life]/6

Before the lifetime of an object has started but after the storage which the object will occupy has been allocated [footnote 26]

...

26) For example, before the dynamic initialization of an object with static storage duration ...

This interpretation makes the most sense to me, because otherwise it would be legal to access class instances before they undergo dynamic initialization, before they could establish their invariants (including the standard library classes defined by the standard).

There's also a conflicting note in [basic.start.static]/3, but that one is older than the one I mentioned above.

Shack answered 15/6, 2021 at 18:41 Comment(7)
No it's not. int i = i; is equal to int i; i = i;. https://mcmap.net/q/296731/-is-reading-an-uninitialized-value-always-an-undefined-behaviour-or-are-there-exceptions-to-it - Eldoraeldorado
Interesting. The definition of lifetime changed between C++17 and C++20, and this footnote changed to match. This does contradict the example in [basic.start.dynamic]/5, but examples are also non-normative, and it seems likely they just missed updating this example. - Adamantine
@Eldoraeldorado That question is about C, and int i; i = i; isn't even legal at namespace scope. - Adamantine
@Adamantine Can you explain the contradiction? - Shack
If anything, I think this (non-normative) footnote contradicts the (non-normative) note in [basic.start.static]/3. - Symbolic
@Symbolic Argh. Edited. - Shack
The definitions changed... makes sense, as now we have constexpr constructors. So you can't just say that primitive types work one way and types with constructors have separate storage allocation and initialization. - Percyperdido

It appears to me that int i = i; has undefined behavior, but not because of an indeterminate value. The term indeterminate value applies only to objects that have automatic or dynamic storage duration.

[basic.indet#1]

When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced ([expr.ass]).

[basic.indet#2]

If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases...

In your example, the object named i has static storage duration, hence the rules about indeterminate values do not apply to it. Such an object is zero-initialized before any dynamic initialization, as per [basic.start.static#2]:

Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization. All static initialization strongly happens before ([intro.races]) any dynamic initialization.

Hence, its initial value is zero when i is used as an initializer to initialize itself. That is a dynamic initialization, and it obeys [dcl.init]:

Otherwise, the initial value of the object being initialized is the (possibly converted) value of the initializer expression.

However, reading i in its own initializer violates the rule in [basic.life]:

The program has undefined behavior if:

  • the glvalue is used to access the object, or
Lazarus answered 10/8, 2021 at 8:4 Comment(0)

While the given answers provide plausible explanations, there is no definitive answer. Your question boils down to CWG Issue 2821: Lifetime, zero-initialization, and dynamic initialization. In the case of

int i = i;

... there are three possible interpretations, none of which is obviously wrong:

  1. Zero-initialization begins the lifetime of i, and then dynamic initialization would modify the value of i ([defns.access]).

  2. Zero-initialization begins the lifetime of i, and then dynamic initialization transparently replaces ([basic.life]) i with a completely new object.

  3. Zero-initialization does not begin the lifetime of i, meaning that i on the right-hand side is performing a value computation of an object whose lifetime hasn't started. This would be undefined behavior.

In any case, the committee seems intent on making this int i = i; valid; the question is just how that would be put into words.

On the other hand,

int main() {
    int a = a;
}

... is obviously undefined behavior because during this copy-initialization, the lvalue a is used to access a ([basic.life] p7.1), but the lifetime of a hasn't yet begun. Zero-initialization does not bail us out here.

Joann answered 25/4 at 10:12 Comment(1)
Thanks. If one adds constexpr, as in constexpr int i = i;, then all compilers agree, diagnosing a read of an object outside its lifetime or a read of an uninitialized symbol. - Stockbreeder
