Is reading an uninitialized value always an undefined behaviour? Or are there exceptions to it?

An obvious example of undefined behavior (UB), when reading a value, is:

int a;
printf("%d\n", a);

What about the following examples?

int i = i;     // `i` is not initialized when we are reading it by assigning it to itself.
int x; x = x;  // Is this the same as above?
int y; int z = y;

Are all three examples above also UB, or are there exceptions to it?

Jester answered 23/5, 2021 at 18:17 Comment(20)
All three have undefined behaviour.Chamfer
some folks here are saying first example is not UB: stackoverflow.com/questions/67662266Jester
Learning about trap representations would be helpful.Severe
@Jester the links from the duplicate you linked in comment, say that it is syntactically legal, but the behaviour is undefined.Chamfer
They are saying using i later on will be undefined. I'm arguing int i=i; itself is undefined behaviour.Jester
I think the ground has been fairly well covered already. You are asking the same question that was recently closed as a duplicate.Chamfer
but never got to a conclusion. is it UB or not? all 3 above are?Jester
Reading from an uninitialised variable is undefined behaviour. Right there and then.Chamfer
ye but some are saying int i = i; isn't actually reading i, hence NOT UB. please check the comments on the link.Jester
If you think that int i = i; isn't doing anything, why is it in your code?Chamfer
int i = i; is semantically equivalent to int i; i = i; Both are UB. But just because you have UB in your code doesn't mean the compiler has to do something about it, it's part of the whole undefined bit. A decent compiler will be able to detect it and warn you about it though, but from the compiler's point of view it's not an error.Severe
@Someprogrammerdude LOL he has other question, and I have been corrected few times before, but int i; i = i invokes UB, but is int i = i in itself not UB, per se?Turnsole
@KPCT they are all UB.Jester
It is not because of what the compiler, or the processor might or might not do, but because the C standard says so although I can't find the relevant clause.Chamfer
@Jester is it OK if I tweak your question?Turnsole
Just saying, the rules might be slightly different in C vs C++. The links I posted under your previous question were for C++.Danica
@Danica welcome to this DUPE. :)Turnsole
@Danica I have a question for you. Look at this other way int a; ... might cause UB, but int a; a = 5 is not UB. thus int a = a it depends on next line. So int a = a; a = 5 is this UB, I do not think so?Turnsole
For more in-depth analysis see stackoverflow.com/questions/25074180 . (I would suggest this is a duplicate; even though the question is not exactly the same, the same answers do apply)Lodestone
It's UB. For example in Itanium reading an uninitialized variable triggers a trap. See Is a^a or a-a undefined behaviour if a is not initialized?Iambic

Each of the three lines triggers undefined behavior. The key part of the C standard that explains this is section 6.3.2.1p2, regarding conversions:

Except when it is the operand of the sizeof operator, the _Alignof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue); this is called lvalue conversion. If the lvalue has qualified type, the value has the unqualified version of the type of the lvalue; additionally, if the lvalue has atomic type, the value has the non-atomic version of the type of the lvalue; otherwise, the value has the type of the lvalue. If the lvalue has an incomplete type and does not have array type, the behavior is undefined. If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.

In each of the three cases, an uninitialized variable is used as the right-hand side of an assignment or initialization (which for this purpose is equivalent to an assignment) and undergoes lvalue conversion. The final sentence of the quoted paragraph applies here, as the objects in question have not been initialized and never had their address taken.

This also applies to the int i = i; case as the lvalue on the right side has not (yet) been initialized.
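
By contrast, the first sentence of 6.3.2.1p2 lists contexts in which no lvalue conversion takes place, so a variable can appear in its own initializer without being read. A minimal sketch, assuming a hosted C11 compiler; the function names size_of_self and points_to_self are made up for illustration:

```c
#include <stddef.h>

/* The operand of sizeof is not evaluated here, so n is never read
   even though it appears in its own initializer. */
size_t size_of_self(void)
{
    size_t n = sizeof n;   /* defined: operand of sizeof */
    return n;
}

/* Taking the address of p does not read the value stored in p. */
int points_to_self(void)
{
    void *p = &p;          /* defined: operand of unary & */
    return p == (void *)&p;
}
```

Both initializers name the variable being declared, yet neither performs lvalue conversion on it, which is what separates them from int i = i;.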

There was debate in a related question over whether the right side of int i = i; is UB because the lifetime of i has not yet begun. However, that is not the case. From section 6.2.4p5 and p6:

5 An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration, as do some compound literals. The result of attempting to indirectly access an object with automatic storage duration from a thread other than the one with which the object is associated is implementation-defined.

6 For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.

So in this case the lifetime of i begins before the declaration is encountered. int i = i; is therefore still undefined behavior, but not for this reason.

The final sentence of 6.3.2.1p2 does, however, open the door for a use of an uninitialized variable that is not undefined behavior: namely, when the variable in question has had its address taken. For example:

int a;
printf("%p\n", (void *)&a);
printf("%d\n", a);

In this case it is not undefined behavior if:

  • The implementation does not have trap representations for the given type, OR
  • The value chosen for a happens to not be a trap representation.

In either case the value of a is unspecified. In particular, this holds for GCC and Microsoft Visual C++ (MSVC) in this example, as these implementations do not have trap representations for integer types.

Tullis answered 23/5, 2021 at 19:28 Comment(2)
Look at this other way int a; ... might cause UB, but int a; a = 5 is not UB. thus int a = a it depends on next line. So int a = a; a = 5 is this UB, I think not?Turnsole
@KPCT The second line doesn't matter. int a = a; by itself is undefined behavior as per 6.3.2.1p2. If at some point later &a was used then it might not be UB depending on the implementation.Tullis

Using an uninitialized object with automatic storage duration invokes UB.

Using an object with static storage duration that has no explicit initializer is defined, because such objects are initialized to zero.

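#include <stdio.h>

int a;                   // static storage duration: zero-initialized

int foo(void)
{
    static int b;        // static storage duration: zero-initialized
    int c;

    int d = d;           //UB
    // static int e = e; // does not compile in C: the initializer is not a constant expression

    printf("%d\n", a);   //OK
    printf("%d\n", b);   //OK
    printf("%d\n", c);   //UB

    return 0;
}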
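
As a sketch of the defined half of this answer: objects with static storage duration that have no explicit initializer start out as zero (arithmetic types) or as null pointers, per C11 6.7.9p10. The names total, label, origin and next_id below are hypothetical and only serve the illustration:

```c
#include <stddef.h>

struct point { int x, y; };

int total;                 /* file scope, no initializer: starts at 0 */
const char *label;         /* pointer: starts as a null pointer */
struct point origin;       /* aggregate: every member starts at 0 */

/* Counts its own calls; the block-scope static starts at 0. */
int next_id(void)
{
    static int calls;      /* initialized once, before program startup */
    return ++calls;
}
```

Reading any of these objects before assigning to them is well defined, unlike reading c or d in foo().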
Overkill answered 23/5, 2021 at 18:31 Comment(4)
is UB or not when.. was this an answer or a question?Jester
Assume compiler optimization is disabled, then is it UB on its own?Jester
That doesn't mean anything. It could still provide assembly but be UB in general. Maybe it works in one compiler but not in another.Jester
@Jester yes clang generates the code. It is UBOverkill

In cases where an action on an object of some type might have unpredictable consequences on platforms where the type has trap representations, but have at-least-somewhat predictable behavior for types that don't, the Standard will seek to avoid distinguishing platforms that do or don't define the behavior by throwing everything into the catch-all category of "Undefined Behavior".

With regard to the behavior of uninitialized or partially-initialized objects, I don't think there's ever been a consensus over exactly which corner cases must be treated as though objects were initialized with Unspecified bit patterns, and which cases need not be treated in such fashion.

For example, given something like:

#include <string.h>

struct ztstr15 { char dat[16]; } x,y;
void test(void)
{
  struct ztstr15 hey;       // dat[4..15] are never written
  strcpy(hey.dat, "Hey");   // writes 'H', 'e', 'y', '\0'
  x=hey;
  y=hey;
}

Depending upon how x and y will be used, there are at least four ways it might be useful to have an implementation process the above code:

  1. Squawk if an attempt is made to copy any automatic-duration object that isn't fully initialized. This could be very useful in cases where one must avoid leakage of confidential information.

  2. Zero-fill all unused portions of hey. This would prevent leakage of confidential information on the stack, but wouldn't flag code that might cause such leakage if the data weren't zero-filled.

  3. Ensure that all parts of x and y are identical, without regard for whether the corresponding members of hey were written.

  4. Write the first four bytes of x and y to match those of hey, but leave some or all of the remaining portions holding whatever they held before test() was called.

I don't think the Standard was intended to pass judgment as to whether some of those approaches would be better or worse than others, but it would have been awkward to write the Standard in a manner that would define behavior of test() while allowing for option #3. The optimizations facilitated by #3 would only be useful if programmers could safely write code like the above in cases where client code wouldn't care about the contents of x.dat[4..15] and y.dat[4..15]. If the only way to guarantee anything about the behavior of that function were to write all portions of hey, including those whose values would be irrelevant to program behavior, that would nullify any optimization advantage approach #3 could have offered.
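
For comparison, a programmer who needs x and y to match byte for byte today can sidestep the question by initializing the whole object before copying it. A minimal sketch, assuming the struct from the example above; test_initialized is a hypothetical name, and the zero-fill gives up exactly the optimization discussed here:

```c
#include <string.h>

struct ztstr15 { char dat[16]; };

struct ztstr15 x, y;

void test_initialized(void)
{
    struct ztstr15 hey = {0};   /* all 16 bytes of dat are set to 0 */
    strcpy(hey.dat, "Hey");     /* overwrites dat[0..3] with 'H','e','y','\0' */
    x = hey;                    /* every byte copied has a determinate value */
    y = hey;
}
```

After the call, x.dat and y.dat both hold "Hey" followed by twelve zero bytes, at the cost of writing bytes the client code may never look at.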

Elisabeth answered 10/10, 2022 at 22:30 Comment(0)
