Are default-initialized variables automatically zero?
W

10

79

If I don't assign a value to a variable when I declare it, does it default to zero or just whatever was previously in the memory?

e.g.

float x;
Woodrowwoodruff answered 17/5, 2011 at 14:48 Comment(13)
Thanks - following on from this then, is there a shortcut to assign zero to all of the following?: float x1, x2, x3, x4, x5, y1, y2, y3, y4, y5;Woodrowwoodruff
Default behaviour is an "undefined" value but some compilers will fill it with null but you can't expect that as default behaviour.Schlenger
x1 = x2 = x3 = x4 = x5 = y1 = y2 = y3 = y4 = y5 = 0.0f;Bermudez
@Chris Visual C++ doesn't seem to like that ...Woodrowwoodruff
Not that I know of. If you store them in an array, you can loop through the array and initialize each element to 0.Decani
@Matt: you still have to declare the variables first.Claviform
float x[6] = {};float y[6] = {};Rutledge
"some compilers will fill it" -- No they won't; no compiler does that, and if you find one that does, don't use it. "Visual C++ doesn't seem to like that ..." -- only if you misread and misapplied it.Twist
Note that this answer has been retagged to only concern C++, as that's what the accepted answer discusses. C and C++ are two very distinct languages with their own standards. The C equivalent of this question is here.Eustacia
I guess questions that apply to a different standard are better posted as new ones (I don't think they are dupes). Otherwise I feel I could miss a very good answer, applicable to previous standards, buried beneath a newer answer. Plus, I do not see any con to asking a separate question, and possibly linking the old one.Readiness
@sancho.sReinstateMonicaCellio Oh, maybe I should have asked on Meta first, but I thought that editing or adding answers was the usual approach to this problem. Also, I wasn't really happy with any of the answers. Multiple times already I intended to use this question as duplicate, but the correct and important information is scattered throughout the answers.Bought
@Bought - This is my opinion (and possibly good for Meta), according to how I would most likely grasp the information when I read it.Readiness
To begin with, when you declare (also interesting) a variable there is no memory allocated, and it is not assigned anything. This is true for any C++ standard. I guess you meant "If I don't assign a value to a variable when I define it, does it default to zero or just whatever was previously in the memory?". In that case, I guess you meant "... does it default to anything..." (not necessarily referring to an int or any other type/class that has a "zero").Readiness
D
72

A declared variable can be Zero Initialized, Value Initialized or Default Initialized.

The C++03 Standard 8.5/5 aptly defines each:

To zero-initialize an object of type T means:

— if T is a scalar type (3.9), the object is set to the value of 0 (zero) converted to T;
— if T is a non-union class type, each nonstatic data member and each base-class subobject
is zero-initialized;
— if T is a union type, the object’s first named data member is zero-initialized;
— if T is an array type, each element is zero-initialized;
— if T is a reference type, no initialization is performed.

To default-initialize an object of type T means:
— if T is a non-POD class type (clause 9), the default constructor for T is called (and the initialization is ill-formed if T has no accessible default constructor);
— if T is an array type, each element is default-initialized;
— otherwise, the object is zero-initialized.

To value-initialize an object of type T means:
— if T is a class type (clause 9) with a user-declared constructor (12.1), then the default constructor for T is called (and the initialization is ill-formed if T has no accessible default constructor);
— if T is a non-union class type without a user-declared constructor, then every non-static data member and base-class component of T is value-initialized;
— if T is an array type, then each element is value-initialized;
— otherwise, the object is zero-initialized

For example:

#include<iostream>
using namespace std;

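static int a; // static storage duration: zero-initialized
int b;        // static storage duration (global): zero-initialized

int main()
{
    int i;        // automatic storage duration: indeterminate value; reading it is undefined behavior
    static int j; // static storage duration: zero-initialized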

    cout<<"\nLocal Uninitialized int variable [i]"<<i<<"\n";

    cout<<"\nLocal Uninitialized Static int variable [j]"<<j<<"\n";

    cout<<"\nGlobal Uninitialized Static int variable [a]"<<a<<"\n";

    cout<<"\nGlobal Uninitialized int variable [b]"<<b<<"\n";

    return 0;
}

You will notice that the results for variable i differ between compilers. Such local uninitialized variables SHOULD NEVER be used. In fact, if you turn on strict compiler warnings, the compiler will warn about it (and report an error when warnings are treated as errors). Here's how codepad reports it:

cc1plus: warnings being treated as errors
In function 'int main()':
Line 11: warning: 'i' is used uninitialized in this function

Edit: As rightly pointed out by @Kirill V. Lyadvinsky in the comments, SHOULD NEVER is rather strong wording, and there can be perfectly valid code that involves uninitialized variables, as the example in his comment shows. So, I should probably say:
You should never use uninitialized variables unless you know exactly what you are doing.
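
The definitions quoted above are easier to tell apart next to the syntax that triggers them. The following is a minimal sketch, assuming a C++11 or later compiler; the names `Pod`, `is_zero`, `value_init_is_zero` and `demo_value_init` are made up for illustration and do not come from the standard.

```cpp
#include <cassert>

// Hypothetical aggregate with no user-declared constructor.
struct Pod {
    int   n;
    float f;
    int*  p;
};

// True if a Pod holds only zeros.
bool is_zero(const Pod& v)
{
    return v.n == 0 && v.f == 0.0f && v.p == nullptr;
}

// Value-initializes several objects and reports whether all are zero.
bool value_init_is_zero()
{
    int i = int();     // scalar: zero-initialized
    int k{};           // same, with C++11 brace syntax
    Pod a = Pod();     // no user-declared constructor: every member is zeroed
    Pod b{};           // empty braces on an aggregate: every member is zeroed
    int arr[4] = {};   // every element is zero

    bool ok = i == 0 && k == 0 && is_zero(a) && is_zero(b);
    for (int e : arr) ok = ok && e == 0;
    return ok;
}

void demo_value_init()
{
    // By contrast, a block-scope `int j;` or `Pod c;` would be default-initialized
    // and hold an indeterminate value that must not be read.
    assert(value_init_is_zero());
}
```

Calling `demo_value_init()` from `main` runs the check. The empty parentheses or braces are what request value-initialization; leaving them out gives default-initialization instead.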

Dumah answered 17/5, 2011 at 15:5 Comment(18)
“local uninitialized variables SHOULD NEVER be used ” – Unless you’re seeding a random number generator and using the uninitialised memory as an added source of entropy. ;-)Metagalaxy
"SHOULD NEVER be used" is a strong statement and as we know all strong statements are wrong :) The compiler will give you the same warning if you'll pass an uninitialized variable to a function by reference, but this will be absolutely correct code, for instance: int x; const bool ok = get_value( x ); if ( ok ) { /* use x */ }Overgrow
@Kirill V. Lyadvinsky: I guess you are correct since You proved it through your code example. I will modify the strong statement in the answer. Thanks, for pointing that out.Dumah
@Konrad: no, even then. it isn't guaranteed to be uninitialized memory - the compiler can zero it out if it wants. so unless you want the extra source of entropy to potentially not be one, you shouldn't.Claviform
@Claviform It’s not doing any harm though. As long as you don’t expect any extra entropy from the uninitialised memory (that was the Debian bug if I recall correctly), there’s absolutely no problem of using whatever you can get as an extra bonus.Metagalaxy
@Als: I'm a C coder, not a C++ one. Today, in C, whitespace is cheap: if it isn't expensive in C++ you can use more of it and make your source code easier to parse by humans (subjective, I know).Goree
@Konrad: Actually, it is doing harm. Reading an uninitialized local variable does not give you some garbage value -- it is undefined behavior. There are runtime environments that will shut down the program when reading an uninitialized local variable (Visual Studio Debug Builds are a prominent example).Bartley
@Fred Of course, this needs to be taken into account. But even undefined behaviour isn’t always bad. In fact, some operating system functions require UB to function.Metagalaxy
@Xeo: I assume he means that a given system/hardware/whatever will define behavior that is UB in the C or C++ standards. So for example division by zero is UB, but you might be able to configure your C implementation to deliver a signal, or your C++ implementation to throw an exception, when it occurs. So on that implementation you can do the UB thing which you could not do in portable code.Lattie
@Steve: I meant the "some operating system functions require UB to function" part.Seiler
@Xeo: yes, the delivering of signals to processes in which division-by-zero occurs, which is an OS function, requires UB to function. Specifically, that division-by-zero, despite being UB, results in some kind of hardware interrupt, that the implementation can handle and send the signal. For another example, there may be OS code that deliberately dereferences a null pointer, knowing that it will result in a segfault, because at that point the OS code wants a segfault. And as long as that OS code isn't outwitted by the compiler, it will get it...Lattie
@Xeo: I may of course be wrong what Konrad meant, but he'd certainly be correct to say that OSes rely on the locally-defined behaviour of code that's UB by the standard.Lattie
@Steve: Yeah, but atleast I understand your example now. :) Hah, Microsoft is quite clever, providing its own compiler + implementation so they can rely on their OS behaviour... :)Seiler
@Xeo: and occasionally there are some interesting bunfights between the maintainers of GCC, and the maintainers of glibc or the linux kernel, as to what the locally-defined behavior actually is.Lattie
@Seiler @Steve What I was actually thinking of were issues like byte order, alignment and in particular the effect of reinterpret_cast and reading/writing raw byte representations from and to streams. but Steve’s examples are of course just as valid.Metagalaxy
Using undefined values is UB. SHOULD NEVER is completely correct. Kirill's comment is almost correct - his code is valid (and, yes, a compiler might wrongly warn on it), but that's only because he in fact doesn't use the uninitialised value.Leonoraleonore
Are the lists in the beginning of the answer correct? I think the answer is wrong at least in C++11 and later. The respondent gets the default-initialization for types besides classes with default constructors or arrays thereof dead wrong. Outside those types, default-initialization leaves the object uninitialized! This is a core difference from value-initialization, which does a zero-initialization pass first.Huckaby
I don't see how this is a good answer. It does not actually explain the difference between storage classes and just assumes it in the example. The example also labels the definition itself as undefined behavior, when really only the later use has undefined behavior. Also the explanation "Might be Initialized to anything" is not what "undefined" means. See also discussion in the comments above. It is not clear to me how the list of initialization rules is helpful to understand what is going on.Bought
O
22

It depends. If this is a local variable (an object with automatic storage duration), it will be uninitialized; if it is a global variable (an object with static storage duration), it will be zero-initialized. Check also this answer.

Overgrow answered 17/5, 2011 at 14:51 Comment(2)
It also depends on the type - a variable of class type has its default constructor called, and that might do anything (including nothing).Lattie
I think it's better to use static vs. automatic as opposed to global vs. local terminologyBreen
A
11

Since the current top answer was written in 2011 and only refers to C++03, I am providing an updated answer that takes into account the changes made after C++11. Note that I am stripping any information that only held true until C++03 or C++11, as well as unnecessary notes that can be seen in the original sources. I am quoting the original specifications as much as I can, in order to avoid unnecessary reformulation which may lead to inexact information. Please consult the original sources I am providing if you are interested in diving deeper into a certain topic.

Also, be warned that I am mainly focusing on the rules regarding

  • Default initialization
  • Undefined behavior
  • Zero-initialization

since it seems to me that these are the most important aspects needed to understand how a variable behaves "by default", as the question is asking.

Default initialization is performed in some cases:

  1. when a variable with automatic, static, or thread-local storage duration is declared with no initializer;
  2. when an object with dynamic storage duration is created by a new-expression with no initializer;
  3. when a base class or a non-static data member is not mentioned in a constructor initializer list and that constructor is called.

and the effects of this default initialization are:

  • if T is a non-POD (until C++11) class type, the constructors are considered and subjected to overload resolution against the empty argument list. The constructor selected (which is one of the default constructors) is called to provide the initial value for the new object;

  • if T is an array type, every element of the array is default-initialized;

  • otherwise, nothing is done: the objects with automatic storage duration (and their subobjects) are initialized to indeterminate values.

This means that if the uninitialized variable is a local (say, an int that exists only in a function's scope), its value is indeterminate, and reading that value is undefined behavior. cppreference strongly discourages the usage of uninitialized variables.
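
The three contexts listed above can be sketched as follows. This is only an illustration under stated assumptions: `Widget`, `default_init_contexts` and `demo_default_init` are hypothetical names, and the code reads only objects whose default initialization gives them a well-defined value.

```cpp
#include <cassert>
#include <string>

// Hypothetical class: `label` and `count` are not mentioned in the
// constructor's initializer list, so both are default-initialized.
struct Widget {
    std::string label;   // class type: default constructor runs, yields ""
    int         count;   // non-class type: indeterminate until assigned
    Widget() {}
};

// Exercises the three contexts and reads only what is safe to read.
bool default_init_contexts()
{
    std::string local;                     // 1. variable with no initializer
    std::string* heap = new std::string;   // 2. new-expression with no initializer
    Widget w;                              // 3. members omitted from the initializer list

    bool ok = local.empty() && heap->empty() && w.label.empty();
    delete heap;

    w.count = 3;                // assign before reading: now well-defined
    return ok && w.count == 3;
}

void demo_default_init()
{
    assert(default_init_contexts());
}
```

The `std::string` objects are safe to inspect because default initialization of a class type calls its default constructor, while `w.count` is only read after it has been assigned.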

As a side note, even though most modern compilers will issue a warning (at compile time) if they detect that an uninitialized variable is being used, they usually fail to do so in cases where you are "tricking" them into thinking you may be initializing the variable somehow, such as in:

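#include <iostream>

void myFunction(int&) {}  // takes a reference but never writes to it

int main()
{
    int myVariable;
    myFunction(myVariable);  // does not change the variable
    std::cout << myVariable << std::endl;  // still uninitialized (undefined behavior), yet compilers might think it is now initialized
}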
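
For contrast, passing an uninitialized variable by reference is well-defined as long as nothing reads it before it has been written. The sketch below mirrors the `get_value` pattern from a comment earlier in this thread; the body of `get_value`, the `succeed` parameter, and the helpers `read_or` and `demo_out_param` are assumptions added for illustration.

```cpp
#include <cassert>

// Hypothetical helper: writes to `out` only on success.
bool get_value(int& out, bool succeed)
{
    if (!succeed) return false;   // `out` is left untouched
    out = 42;
    return true;
}

// Returns the fetched value on success and `fallback` otherwise.
int read_or(int fallback, bool succeed)
{
    int x;                                   // deliberately uninitialized
    const bool ok = get_value(x, succeed);   // binding a reference does not read x
    return ok ? x : fallback;                // x is read only after it was written
}

void demo_out_param()
{
    assert(read_or(-1, true) == 42);
    assert(read_or(-1, false) == -1);
}
```

A compiler may still warn here because it cannot always prove that `x` is written before use, which is the flip side of the missed warning shown above.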

Starting from C++14, the following holds (note that std::byte was introduced with C++17):

Use of an indeterminate value obtained by default-initializing a non-class variable of any type is undefined behavior (in particular, it may be a trap representation), except in the following cases:

  • if an indeterminate value of type unsigned char or std::byte is assigned to another variable of type (possibly cv-qualified) unsigned char or std::byte (the value of the variable becomes indeterminate, but the behavior is not undefined);

  • if an indeterminate value of type unsigned char or std::byte is used to initialize another variable of type (possibly cv-qualified) unsigned char or std::byte;

  • if an indeterminate value of type unsigned char or std::byte (since C++17) results from

    • the second or third operand of a conditional expression,
    • the right operand of the comma operator,
    • the operand of a cast or conversion to (possibly cv-qualified) unsigned char or std::byte,
    • a discarded-value expression.

Additional details about the default initialization of variables and their behavior can be found here.

To dive deeper into indeterminate values, in 2014 the following changes were made (as Shafik Yaghmour pointed out here with additional useful resources):

If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value. [Note: Objects with static or thread storage duration are zero-initialized]

to:

If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced. [Note: Objects with static or thread storage duration are zero-initialized] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:

  • If an indeterminate value of unsigned narrow character type is produced by the evaluation of:

    • the second or third operand of a conditional expression (5.16 [expr.cond]),

    • the right operand of a comma,

    • the operand of a cast or conversion to an unsigned narrow character type, or

    • a discarded-value expression,

      then the result of the operation is an indeterminate value.

  • If an indeterminate value of unsigned narrow character type is produced by the evaluation of the right operand of a simple assignment operator whose first operand is an lvalue of unsigned narrow character type, an indeterminate value replaces the value of the object referred to by the left operand.

  • If an indeterminate value of unsigned narrow character type (3.9.1 [basic.fundamental]) is produced by the evaluation of the initialization expression when initializing an object of unsigned narrow character type, that object is initialized to an indeterminate value.

Finally, there is the subject of zero-initialization which is performed in the following situations:

  1. For every named variable with static or thread-local storage duration that is not subject to constant initialization (since C++14), before any other initialization.

  2. As part of value-initialization sequence for non-class types and for members of value-initialized class types that have no constructors, including value initialization of elements of aggregates for which no initializers are provided.

  3. When an array of any character type is initialized with a string literal that is too short, the remainder of the array is zero-initialized.

The effects of zero initialization are:

  • If T is a scalar type, the object's initial value is the integral constant zero explicitly converted to T.

  • If T is a non-union class type, all base classes and non-static data members are zero-initialized, and all padding is initialized to zero bits. The constructors, if any, are ignored.

  • If T is a union type, the first non-static named data member is zero-initialized and all padding is initialized to zero bits.

  • If T is an array type, each element is zero-initialized.

  • If T is reference type, nothing is done.

Following are some examples:

#include <iostream>
#include <string>

struct Coordinates {
    float x, y;
};

class WithDefaultConstructor {
    std::string s;
};

class WithCustomConstructor {
    int a, b;

public:
    WithCustomConstructor() : a(2) {}
};

int main()
{
    int a;    // Indeterminate value (non-class, automatic storage)

    // int& b;   // Error: a reference must be initialized

    std::string myString;    // Default-initialized to ""
                             // (class, calls default constructor;
                             // no zero-initialization happens first)

    double coordsArray[2];   // Both elements have indeterminate values
                             // (array of non-class type, automatic storage)

    Coordinates* pCoords;    // Indeterminate value (not nullptr)

    Coordinates coords = Coordinates();  // Value-initialized: members are zero-initialized

    // x: 0
    // y: 0
    std::cout << "x: " << coords.x << '\n'
              << "y: " << coords.y << std::endl;

    WithDefaultConstructor wdc;    // Since no constructor is provided,
                                   // calls the implicit default constructor

    WithCustomConstructor wcs;     // Calls the provided constructor
                                   // a is initialized, while b is
                                   // default-initialized to an indeterminate value
}
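
The three zero-initialization situations listed earlier can also be checked directly. This is a hedged sketch rather than an exhaustive test: `Point`, the `g_` variables and the three checker functions are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>

struct Point { float x, y; };   // hypothetical aggregate

// 1. Static storage duration: zero-initialized before any other initialization.
static int    g_counter;
static Point  g_origin;
static int*   g_ptr;
static double g_table[3];

bool statics_are_zero()
{
    static long calls;   // function-local statics are covered too
    bool ok = g_counter == 0 && g_origin.x == 0.0f && g_origin.y == 0.0f
           && g_ptr == nullptr && calls == 0;
    for (double d : g_table) ok = ok && d == 0.0;
    return ok;
}

// 2. Value-initialization of a non-class type or a class without constructors.
bool value_init_zeroes_members()
{
    Point p = Point();
    int   n = int();
    return p.x == 0.0f && p.y == 0.0f && n == 0;
}

// 3. A string literal shorter than the character array: the rest is zero.
bool short_literal_pads_with_zero()
{
    char buf[8] = "abc";
    for (std::size_t i = 3; i < sizeof buf; ++i)
        if (buf[i] != '\0') return false;
    return true;
}

void demo_zero_init()
{
    assert(statics_are_zero());
    assert(value_init_zeroes_members());
    assert(short_literal_pads_with_zero());
}
```

None of these checks reads an automatic variable that was declared without an initializer, so they stay clear of undefined behavior.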
Allie answered 13/4, 2020 at 8:6 Comment(3)
Thank you for the answer, but I think it could benefit from one or two examples of variables that will have indeterminate value and variables that will be zero-initialized, as well as maybe an example demonstrating where exactly the use of an indeterminate value causes UB. Also minor nitpicks: I would not call cppreference "the C++ documentation" because it is not an official reference for the standard and "its default behavior is indeterminate" is not really correct. Its value is indeterminate and evaluating such a value is (with exceptions listed) undefined, not indeterminate.Bought
@Bought thanks for pointing that out, my mistake. I corrected the part citing the "documentation". I also added a few examples that hopefully will clarify when/how variables are default-initialized, and also some examples of zero-initialization.Allie
Unfortunately some of your examples are now wrong. No zero initialization happens for myString before the default constructor is called. coordsArray and pCoords are not zero-initialized, so they will have indeterminate values.Bought
R
8

It depends on the storage duration of the variable. Variables with static storage duration are always zero-initialized before program start-up: zero-initialization for basic types, enums and pointers is the same as if you'd assigned 0, appropriately converted to the type, to the variable. This occurs even if the variable has a constructor, before the constructor is called.
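
The last point, that zero-initialization happens before the constructor runs, can be observed with a constructor that leaves a member alone. A minimal sketch, with `Lazy`, `g_lazy`, `static_member_value` and `demo_static_ctor` as made-up names:

```cpp
#include <cassert>

// Hypothetical class whose constructor deliberately leaves `value` alone.
struct Lazy {
    int value;
    Lazy() {}   // does not initialize `value`
};

static Lazy g_lazy;   // static storage: zero-initialized first, then Lazy() runs

// Returns the member that only the zero-initialization step ever wrote.
int static_member_value()
{
    return g_lazy.value;
}

void demo_static_ctor()
{
    assert(static_member_value() == 0);
}
```

The same `Lazy` object declared as a local variable would have an indeterminate `value`, so this only works because of the static storage duration.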

Rattle answered 17/5, 2011 at 14:53 Comment(0)
M
7

This depends on where you declare it. Variables in the global scope are initialized with 0, and stack variables are left uninitialized, so their values are indeterminate.

Metzler answered 17/5, 2011 at 14:52 Comment(1)
as in my comment to @Kirill, better say static/automatic than global/local. Local variable can be static too as you knowBreen
C
4

I think it's undefined. I think some compilers, when compiling in debug mode, initialize it to zero. But it's also ok to have it be whatever was already there in memory. Basically - don't rely on either behavior.

UPDATE: As per the comments - global variables will be zero-initialized. Local variables will be whatever.

To answer your second question:

Thanks - following on from this then, is there a shortcut to assign zero to all of the following?: float x1, x2, x3, x4, x5, y1, y2, y3, y4, y5

You could do

float x[5] = {0,0,0,0,0}; float y[5] = {0,0,0,0,0};

and use x[0] instead of x1.
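
A shorter spelling of the same idea, sketched under the assumption that five elements per array are wanted as in the question; `all_zero`, `zeroed_arrays_ok` and `demo_zeroed_arrays` are hypothetical helper names.

```cpp
#include <array>
#include <cassert>

// True if every element of the range compares equal to zero.
template <typename Range>
bool all_zero(const Range& r)
{
    for (float v : r)
        if (v != 0.0f) return false;
    return true;
}

bool zeroed_arrays_ok()
{
    float x[5] = {};             // empty braces: all five elements are 0.0f
    float y[5] = {0};            // first element explicit, the rest zeroed
    std::array<float, 5> z{};    // same idea with std::array (C++11)
    return all_zero(x) && all_zero(y) && all_zero(z);
}

void demo_zeroed_arrays()
{
    assert(zeroed_arrays_ok());
}
```

Any element without an explicit initializer inside the braces is zeroed, so the list never has to spell out all five values.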

Claviform answered 17/5, 2011 at 14:50 Comment(1)
it is sufficient to write float x[5] = {0};Eustacia
M
3

Using the value of any variable prior to initialization (note that static-storage-duration objects are always initialized, so this only applies to automatic storage duration) results in undefined behavior. This is very different from containing 0 as the initial value or containing a random value. UB means it's possible that anything could happen. On implementations with trap bits it might crash your program or generate a signal. It's also possible that multiple reads result in different unpredictable values, among any other imaginable (or unimaginable) behavior. Simply do not use the value of uninitialized variables.

Note: The following was edited based on comments:

Note that code like this is invalid unless you can ensure that the type foo_t does not have padding bits:

foo_t x;
int i;
for (i=0; i<N; i++) x = (x<<1) | get_bit();

Even though the intent is that the "random value" initially in x gets discarded before the loop ends, the program may invoke UB as soon as it accesses x to perform the operation x<<1 on the first iteration, and thus the entire program output is invalidated.
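
The straightforward way to sidestep the question is to give x a starting value. In the sketch below, the definitions of `foo_t`, `N` and `get_bit` are stand-ins invented for illustration (the answer does not define them), and `read_bits` and `demo_read_bits` are hypothetical wrappers.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint8_t foo_t;   // stand-in for the answer's foo_t
const int N = 8;              // stand-in for the answer's N

// Hypothetical bit source: yields the bits of 0xA5, most significant first.
int get_bit()
{
    static int pos = 0;
    const int bit = (0xA5 >> (7 - pos % 8)) & 1;
    ++pos;
    return bit;
}

foo_t read_bits()
{
    foo_t x = 0;   // explicit initialization: the first x << 1 is well-defined
    for (int i = 0; i < N; i++)
        x = static_cast<foo_t>((x << 1) | get_bit());
    return x;
}

void demo_read_bits()
{
    assert(read_bits() == 0xA5);
}
```

The extra store costs next to nothing and removes any dependence on how indeterminate values are treated for a given type.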

Moonlight answered 17/5, 2011 at 15:19 Comment(12)
what do you mean on implementations with trap bits it might crash your program? can you go into more detail? what value can possibly be put into x (one of 256) that would crash the program when it was being read? i'd understand if x was memory that didn't belong to the program, but it does since it has been allocated on the stack.Claviform
The example code and the trap bits remark are not connected. uintN_t types are not allowed to have padding/trap bits. Trap bits are just one possible example of the mechanism/rationale for UB; the impossibility of trap bits with uint8_t does not change the fact that you're invoking UB.Moonlight
@R.: then why is the entire program output invalidated for the example you gave?Claviform
Because it invokes undefined behavior.Moonlight
@R.: how, exactly? what can possibly happen?Claviform
UB does not have a "how". UB means the C language does not specify any behavior for the code and allows anything to happen. You are not allowed to use the value of uninitialized variables in C.Moonlight
@R.: i mean it just seems that the only undefined behavior allowable here is that x can contain any value. it's not that the code can do anything - it's that the value of an uninitialized variable can be anything. that being said, any possible value x can have in your example will still result in the code working as intended. can you show me where in the spec it says code can literally have any behavior if an uninitialized variable is encountered? all i see is "If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate."Claviform
OK, I think you may be right here. 6.2.6.1(5) states that accessing a trap representation is UB, but for types which cannot have trap representations (because they cannot have padding bits), I think you're guaranteed to have simply an indeterminate value that doesn't invoke UB.Moonlight
@R..: but is what you said about indeterminate values still correct, that they could e.g. vary over time? An OS/language that I used to work on defined no padding bits allowed, but nevertheless if unsigned int x is uninitialized, then 2*x is not guaranteed to be even and 0*x not guaranteed to be 0 - indeterminism was contagious through arithmetic. IIRC that was out of paranoia how bad indeterminate values might be, depending how the C standard was interpreted, rather than certain knowledge that they were allowed to be, since C was one language in which that system might be implemented.Lattie
Since reading a trap representation has UB, if you happened to do this, reading different values each type you read it would be one possible manifestation of UB.. :-)Moonlight
@R..: Sorry, for non-trap types I mean. J.2/1 is quite clear that it's UB if "the value of an object with automatic storage duration is used while it is indeterminate". Not just if it happens to be a trap value, any situation where it's indeterminate. But annex J is informative, not normative, and the sections it references don't seem to really back it up. So I'm uncertain. I wouldn't knowingly write a compiler that did it, but I'm also not sure I can write code that uses indeterminate values even of non-trapping types.Lattie
A lot of things in J.2 seem to lack backing elsewhere in the standard. The first one that comes to mind is the 2D array subscripts one. Also note that in this particular example I mention, whether it's UB may depend on whether the type is a character type (which can alias anything) or not. I seem to recall there being a special treatment for character types somewhere in the text about indeterminate values, too..Moonlight
W
2

It can be compiler specific but generally release builds don't initialise variables to any particular value, so you get whatever is left in memory. Certain magic numbers are used in debug builds by some compilers to mark specific areas of memory however.

Wilbourn answered 17/5, 2011 at 14:51 Comment(3)
"can be compiler specific"? That's not quite what the standard says on the subject...Gravimeter
Well, it's undefined. But MSVC uses certain magic numbers to mark regions of memory in debug builds.Wilbourn
The value is not guaranteed to be what was in the memory before the variable was created.Bought
D
0

C++ does not initialize variables automatically. The value of x is whatever happened to be in the memory at the time. Never assume anything about its initial value.

Decani answered 17/5, 2011 at 14:50 Comment(3)
This is only true for local variables. Global variables are zero-initialized.Metzler
It's not even guaranteed that x has a value at all. It's quite possible that the implementation puts x on the stack, and only grows the stack when x is assigned to. That means there's no memory allocated for x until the first write to it.Holna
The value is not guaranteed to be what was in the memory before the variable was created.Bought
O
0

I am just learning about this, and using a short example I found out that this is possibly related to the compiler version. The learning materials clearly say that global variables are set to zero by default, that local variables use whatever value is currently at that memory location, and that it's good practice to initialize variables to zero. Then I tried this code:

#include <iostream>

using namespace std;

int rezultat;

int main()
{
    int localVariabla1;

    int nr,result;
    cout << nr << endl;
    cout << result << endl;

    for (int i = 0; i<3; i++)
    {
        cout << "Enter number" << endl;
        cin >> nr;
        result += nr;
    }
    cout << "Result is : " << result;

}

After I compile and run this code, all the printed values are zero before entering the for loop, and the result is correct.
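
Getting zeros here is not something the language promises, so a version that does not rely on it initializes the accumulator explicitly. A minimal sketch, assuming the input is whitespace-separated integers; `sum_inputs` and `demo_sum_inputs` are hypothetical names, and the function takes a stream parameter so it can be fed from `std::cin` or from a string.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>

// Reads up to `count` integers from `in` and returns their sum.
int sum_inputs(std::istream& in, int count)
{
    int result = 0;               // explicit zero: no longer depends on luck
    for (int i = 0; i < count; i++)
    {
        int nr = 0;
        if (!(in >> nr)) break;   // stop on bad or missing input
        result += nr;
    }
    return result;
}

void demo_sum_inputs()
{
    std::istringstream input("1 2 3");
    assert(sum_inputs(input, 3) == 6);
}
```

Calling `sum_inputs(std::cin, 3)` would reproduce the interactive behavior of the program above, minus the reads of uninitialized variables.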

Overnight answered 15/1, 2022 at 6:47 Comment(0)
