C++ static member variable and its initialization

Asked 28/12, 2010 at 16:46 Answered 28/12, 2010 at 17:40

Solved c++ initialization static-variables

For static member variables in C++ class - the initialization is done outside the class. I wonder why? Any logical reasoning/constraint for this? Or is it purely legacy implementation - which the standard does not want to correct?

I think having the initialization in the class is more "intuitive" and less confusing. It also conveys both the static and the global nature of the variable. See, for example, the static const member.

Doggy answered 28/12, 2010 at 16:46 Comment(0)

Fundamentally this is because static members must be defined in exactly one translation unit, in order to not violate the One-Definition Rule. If the language were to allow something like:

struct Gizmo
{
  static string name = "Foo";
};

then name would be defined in each translation unit that #includes this header file.
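For contrast, here is the form the language does accept: the class only declares the member, and a single source file defines it. This is a minimal sketch collapsed into one file, with the header/source split marked only by comments; check_gizmo_name() is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <string>

// --- would live in gizmo.h ---
struct Gizmo
{
  static std::string name;  // declaration only: no storage, no initializer
};

// --- would live in exactly one .cpp file ---
std::string Gizmo::name = "Foo";  // the single definition, with its initializer

// Hypothetical helper: every user of Gizmo sees the same object.
void check_gizmo_name()
{
  assert(Gizmo::name == "Foo");
  assert(Gizmo::name.size() == 3);
}
```

Every translation unit that includes the header sees only the declaration, so the program contains exactly one Gizmo::name object.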

C++ does allow you to initialize const integral static members within the class definition, but this is only a shortcut, or syntactic sugar: you still have to provide a definition in a single translation unit. So, this is allowed:

struct Gizmo
{
  static const int count = 42;
};

So long as a) the member is of const integral or const enumeration type, b) the initializer can be evaluated at compile-time, and c) there is still a definition somewhere that doesn't violate the one definition rule:

file: gizmo.cpp

#include "gizmo.h"

const int Gizmo::count;
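The out-of-class definition matters as soon as the member is odr-used, for instance when it is bound to a reference or its address is taken. The sketch below puts header and source content in one file for brevity; at_least_count(), count_address() and check_gizmo_count() are hypothetical helpers, not part of the original answer.

```cpp
#include <algorithm>
#include <cassert>

struct Gizmo
{
  static const int count = 42;  // declaration with a constant initializer
};

// The one definition; the initializer is not repeated here.
const int Gizmo::count;

// std::max takes its arguments by const reference, which odr-uses
// Gizmo::count and so relies on the definition above at link time.
int at_least_count(int n)
{
  return std::max(n, Gizmo::count);
}

// Taking the address is another odr-use.
const int* count_address()
{
  return &Gizmo::count;
}

// Hypothetical self-check of the helpers above.
void check_gizmo_count()
{
  assert(at_least_count(10) == 42);
  assert(at_least_count(100) == 100);
  assert(*count_address() == 42);
}
```

Without the `const int Gizmo::count;` line, code like this may fail to link with an undefined-reference error, depending on the compiler and optimization level.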

Furthermost answered 28/12, 2010 at 17:2 Comment(3)

The One Definition Rule is: "No translation unit shall contain more than one definition of any variable, function, class type, enumeration type or template". If your first Gizmo example were legal, I do not think that it would violate the One Definition Rule because each translation unit would have a single definition of Gizmo::name. – Invalidity 28/12, 2010 at 18:26

@Daniel Trebbien: That's not the whole ODR. That's just 3.2/1 - a first rough blanket "layer" of ODR (to take care of the most obvious violations). The full ODR has a more detailed set of requirements for each kind of entity. For external-linkage objects (as well as external-linkage functions) ODR is further restricted in 3.2/3 to one and only definition for the entire program. – Maguire 28/12, 2010 at 19:49

@Daniel Trebbien: The reason the requirement of 3.2/1 was separated from the rest is that the violation of 3.2/1 requires diagnostic from the compiler, while for violations of 3.2/3 no diagnostic is required. – Maguire 28/12, 2010 at 19:56

In C++, the presence of an initializer has from the very beginning been an exclusive attribute of an object definition, i.e. a declaration with an initializer is (almost) always a definition.

As you must know, each external object used in a C++ program has to be defined once and only once, in only one translation unit. Allowing in-class initializers for static objects would immediately go against this convention: the initializers would go into header files (where class definitions usually reside) and thus generate multiple definitions of the same static object (one for each translation unit that includes the header file). This is, of course, unacceptable. For this reason, the declaration approach for static class members is left perfectly "traditional": you only declare it in the header file (i.e. no initializer allowed), and then you define it in a translation unit of your choice (possibly with an initializer).

One exception to this rule was made for const static class members of integral or enum types, because such entities can form Integral Constant Expressions (ICEs). The main idea of ICEs is that they are evaluated at compile time and thus do not depend on the definitions of the objects involved. That is why this exception was possible for integral or enum types. For other types it would simply contradict the basic declaration/definition principles of C++.
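To make the ICE point concrete, the sketch below uses such a member only in contexts that need a compile-time value, so no out-of-class definition is involved. The names Tag, buffer, tag_value(), buffer_length() and check_ice_uses() are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Gizmo
{
  static const int count = 42;
  enum { half = count / 2 };  // ICE used as an enumerator value
};

int buffer[Gizmo::count];     // ICE used as an array bound

template <int N>
struct Tag
{
  enum { value = N };
};

// ICE used as a template argument.
int tag_value()
{
  return Tag<Gizmo::count>::value;
}

std::size_t buffer_length()
{
  return sizeof(buffer) / sizeof(buffer[0]);
}

// Hypothetical self-check: all three uses see the value 42.
void check_ice_uses()
{
  assert(buffer_length() == 42);
  assert(tag_value() == 42);
  assert(Gizmo::half == 21);
}
```

In each of these uses the compiler substitutes the value itself, which is why the object's storage never comes into play.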

Maguire answered 28/12, 2010 at 17:22 Comment(0)

It's because of the way the code is compiled. If you were to initialize it in the class, which often is in the header, every time the header is included you'd get an instance of the static variable. This is definitely not the intent. Having it initialized outside the class gives you the possibility to initialize it in the cpp file.

Whitlow answered 28/12, 2010 at 16:52 Comment(7)

This is something that a modern compiler/linker combination could easily resolve, and not a good enough reason for such a cumbersome limitation. – Emancipated 28/12, 2010 at 16:59

@Emancipated is right. A C++ linker is able to resolve multiple definitions of member functions, so why not static member variables? That's what OP is asking, I think. – Invalidity 28/12, 2010 at 17:8

I guess only modern C++ linkers could resolve multiple definitions of methods (member functions). (I.e. the last time I tried to have multiple definitions of a method was years ago and the link failed.) Prior to that, all methods defined in the header needed to be inline or static, and the latter results in multiple copies in the linked file. – Longshore 28/12, 2010 at 17:13

@Daniel: "why not static member variables" because the compiler wouldn't know which translation unit to put the definition in. – Furthermost 28/12, 2010 at 17:16

@Daniel: It's not a problem in the case of multiple definition for member functions because those member functions get multiple definitions. Albeit still one definition per translation unit, but each translation unit uses a different definition. A requirement of statics is that one definition is used by all translation units. – Furthermost 28/12, 2010 at 17:18

@kumar: Yes, that is a possibility. The definition still has to go somewhere, and there can be only one copy for the whole program, else it wouldn't be static. So if you don't tell the compiler which TU to put the definition in, it won't know where to put it. It might try to put it in every TU (which would violate ODR), or just barf an error message saying you need to tell it where to define it (which is what it actually does). – Furthermost 28/12, 2010 at 17:22

@John: While I agree that each translation unit has its own definition of every member function that is defined in the header, the linker must still combine the multiple definitions because each definition has the same symbol. Therefore, the same definition is used by all translation units after linking is completed. – Invalidity 28/12, 2010 at 18:21

Section 9.4.2, Static data members, of the C++ standard states:

If a static data member is of const integral or const enumeration type, its declaration in the class definition can specify a const-initializer which shall be an integral constant expression.

Therefore, it is possible for the value of a static data member to be included "within the class" (by which I presume that you mean within the declaration of the class). However, the type of the static data member must be a const integral or const enumeration type. The reason why the values of static data members of other types cannot be specified within the class declaration is that non-trivial initialization is likely required (that is, a constructor needs to run).

Imagine if the following were legal:

// my_class.hpp
#include <string>

class my_class
{
public:
  static std::string str = "static std::string";
//...

Each object file corresponding to CPP files that include this header would not only have a copy of the storage space for my_class::str (consisting of sizeof(std::string) bytes), but also a "ctor section" that calls the std::string constructor taking a C-string. Each copy of the storage space for my_class::str would be identified by a common label, so a linker could theoretically merge all copies of the storage space into a single one. However, a linker would not be able to isolate all copies of the constructor code within the object files' ctor sections. It would be like asking the linker to remove all of the code to initialize str in the compilation of the following:

std::map<std::string, std::string> map;
std::vector<int> vec;
std::string str = "test";
int c = 99;
my_class mc;
std::string str2 = "test2";

EDIT It is instructive to look at the assembler output of g++ for the following code:

// SO4547660.cpp
#include <string>

class my_class
{
public:
    static std::string str;
};

std::string my_class::str = "static std::string";

The assembly code can be obtained by executing:

g++ -S SO4547660.cpp

Looking through the SO4547660.s file that g++ generates, you can see that there is a lot of code for such a small source file.

__ZN8my_class3strE is the label of the storage space for my_class::str. There is also the assembly source of a __static_initialization_and_destruction_0(int, int) function, which has the label __Z41__static_initialization_and_destruction_0ii. That function is special to g++ but just know that g++ will make sure that it gets called before any non-initializer code gets executed. Notice that the implementation of this function calls __ZNSsC1EPKcRKSaIcE. This is the mangled symbol for std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&).

Going back to the hypothetical example above and using these details, each object file corresponding to a CPP file that includes my_class.hpp would have the label __ZN8my_class3strE for sizeof(std::string) bytes as well as assembly code to call __ZNSsC1EPKcRKSaIcE within its implementation of the __static_initialization_and_destruction_0(int, int) function. The linker can easily merge all occurrences of __ZN8my_class3strE, but it cannot possibly isolate the code that calls __ZNSsC1EPKcRKSaIcE within the object file's implementation of __static_initialization_and_destruction_0(int, int).
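That the initializer is executable code, run once during program startup, can also be observed without reading assembly. This is a minimal single-file sketch; init_calls, make_text() and check_startup_init() are hypothetical names introduced here, and the counter merely stands in for the constructor call discussed above.

```cpp
#include <cassert>
#include <string>

// Hypothetical counter; constant-initialized before any dynamic initialization.
int init_calls = 0;

// Hypothetical initializer function that records each time it runs.
std::string make_text()
{
  ++init_calls;
  return "static std::string";
}

class my_class
{
public:
  static std::string str;
};

// Dynamic initialization: make_text() runs during program startup.
std::string my_class::str = make_text();

// Hypothetical self-check, meant to be called from main().
void check_startup_init()
{
  assert(init_calls == 1);
  assert(my_class::str == "static std::string");
}
```

With a single definition the initializer runs exactly once. If every translation unit carried its own copy of that initialization code, nothing at link time would reduce the calls to one.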

Invalidity answered 28/12, 2010 at 17:40 Comment(4)

Why then is the following not allowed: class my_class { public: static const double pi = 3.14; }; – Furthermost 28/12, 2010 at 17:53

@John: I think that it should be allowed for the same reason why the values of static data members of const integer or const enumeration type can be specified with the declaration. I don't know why it isn't. – Invalidity 28/12, 2010 at 18:6

This suggests to me that "non-trivial" initialization may not be the one and only reason why it's not allowed for non-integral types. – Furthermost 28/12, 2010 at 18:16

@John: I think that I know why const double and const float "are not supported". If these types were supported, then the C++ compiler would have to be able to evaluate "floating-point constant expressions". For example, static const int i = 44 << 6 ^ 0x63ab9900; is allowed, so the compiler has to be able to evaluate constant integral expressions. If static const float f = 24.382f * -999.283f were also allowed, then the C++ compiler would have to have functions to calculate floating-point arithmetic. This might have been seen by the C++ committee as an unnecessary complication. – Invalidity 30/12, 2010 at 15:7

I think the main reason to have initialization done outside the class block is to allow for initialization with return values of other class member functions. If you wanted to initialize a::var with b::some_static_fn() you'd need to make sure that every .cpp file that includes a.h includes b.h first. It'd be a mess, especially when (sooner or later) you run into a circular reference that you could only resolve with an otherwise unnecessary interface. The same issue is the main reason for having class member function implementations in a .cpp file instead of putting everything in your main class's .h.
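A rough sketch of that layout, collapsed into one file with the intended file boundaries marked by comments. The return value 7 and the helper check_a_var() are made up for illustration.

```cpp
#include <cassert>

// --- would live in b.h ---
struct b
{
  static int some_static_fn() { return 7; }  // hypothetical value
};

// --- would live in a.h ---
struct a
{
  static int var;  // declaration only, so a.h does not need b.h
};

// --- would live in a.cpp, the only file that needs both headers ---
int a::var = b::some_static_fn();

// Hypothetical self-check, meant to be called from main().
void check_a_var()
{
  assert(a::var == 7);
}
```

Because the initializer lives in a.cpp, only that one source file depends on b.h; users of a.h are unaffected.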

At least with member functions you do have the option to implement them in the header. With variables you must do the initialization in a .cpp file. I don't quite agree with the limitation, and I don't think there's a good reason for it either.
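Later revisions of the language relaxed this limitation. C++11 added constexpr static members with in-class initializers, and C++17 added inline variables, which let a static data member be defined in the header much like an inline member function. The sketch below assumes a C++17 compiler; check_inline_members() is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

struct Gizmo
{
  // C++17: an inline variable may be defined in every translation unit
  // that includes the header; the program still ends up with one object.
  inline static std::string name = "Foo";

  // C++11: constexpr allows an in-class initializer for literal types
  // such as double; since C++17 such members are implicitly inline.
  static constexpr double pi = 3.14;
};

// Hypothetical self-check, meant to be called from main().
void check_inline_members()
{
  assert(Gizmo::name == "Foo");
  assert(Gizmo::pi > 3.13 && Gizmo::pi < 3.15);
}
```

None of this was available when the thread was written in 2010, so the answers above describe the rules of that time.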

Emancipated answered 28/12, 2010 at 16:59 Comment(0)
