Why is the constructor of a global variable not called in a library?

I have some legacy code with singleton classes that register themselves through the constructors of global variables. It's a large codebase that's compiled into one executable. I am trying to reorganize the codebase and group the code into libraries. A minimal example of the current code is

main.cpp

int main(int argc, char *argv[])
{
  return 0;
} 

Hash.cpp

#include <iostream>

class Hash
{
public:
    Hash()
    {
        std::cout << "Hash\n";
    }
};

Hash a;

and the current build configuration is

CMakeLists.txt

cmake_minimum_required(VERSION 3.26)
project(mcve)

add_executable(mcve main.cpp Hash.cpp)

Building the code and running the executable prints

Hash

I've modified the build configuration to

cmake_minimum_required(VERSION 3.26)
project(mcve)

add_library(Hash Hash.cpp)
add_executable(mcve main.cpp)
target_link_libraries(mcve Hash)

This creates a static library libHash.a and links it to the executable. Compiling the same code and running the executable doesn't print anything. Why is there a difference, and where is it described? Is it part of the C++ standard or specific to the compiler? Is it OS-specific (Linux static libraries)? Is it undefined behavior?

Duffel answered 17/8, 2023 at 23:15 Comment(8)
You have no references to a in main(). I think the linker is optimizing it away and not loading that library. – Indeed
@Indeed The question is why the behavior differs. Until today, I thought that static libraries were just archives of *.o files and that linking a static library was (almost) the same as compiling everything together. Could the optimizer also remove it in the first configuration? That would break the legacy code. If this is a real risk, I need a reliable source to prove it, because fixing it will take a lot of effort. – Duffel
Libraries generally contain many functions and variables, most of which are not used by any particular application that links with them. So linkers are selective: they only load the ones that are needed by the main program. – Indeed
@Indeed But the linker should be involved in both cases. In both cases, the compiler creates a Hash.cpp.o file and links it. – Duffel
"Is it part of the C++ standard": It is implementation-defined whether initialization happens before main is entered or is delayed until the first non-initialization odr-use of a non-inline definition in the same translation unit as the one containing the variable definition. The behavior you see is just what the compiler/linker chose for this implementation-defined behavior. – Kirman
@Kirman But it's the same compiler (implementation) in both cases. – Duffel
@Duffel But not the same linker configuration. Typically linkers do not link a library that isn't referenced, at least with some common options like -Wl,--as-needed, which may be configured as the default. – Kirman
I think I wasn't quite right above. The linker doesn't select individual functions and variables. When linking with a library, it has to determine which object files in the library are needed and which aren't, and it only loads the object files that are actually referenced. But when you link with individual object files, it loads them all. – Indeed

The difference should be described in your linker's documentation, as well as introductory textbooks, I suppose, that explain what static libraries are, how they work, and how to use them.

Without static libraries in the picture: when translation units get explicitly linked together everything in them becomes a part of the executable.

In your first example both main.cpp and Hash.cpp are linked into the mcve executable, and at startup the sole global object from Hash.cpp gets constructed. The End.

Linking with a static library does not, I repeat, does not include everything from the static library in the executable. That's not how static libraries work. Only those translation units in the static library that define symbols which are still undefined in the translation units being linked get pulled into the executable. (It's a little more complicated than that, but the full complexity is immaterial for the purposes of this question and would only confuse things, so we'll work with this simplified description.) That's a defining characteristic of static libraries.
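The simplified rule above can be sketched as a toy model. This is only an illustration, not how any real linker is implemented: the `ObjectFile` struct and the `selectLinkedObjects` function are hypothetical names, and the model ignores archive ordering, weak symbols, and the references Hash.cpp.o really has into the C++ runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model of an object file: the symbols it defines and the ones it still needs.
struct ObjectFile {
    std::string name;
    std::set<std::string> defined;
    std::set<std::string> undefined;
};

// Returns the names of the object files that end up in the executable.
// Objects named explicitly on the link line are always included; archive
// members are included only if they define a symbol that is still unresolved.
std::vector<std::string> selectLinkedObjects(
    const std::vector<ObjectFile>& explicitObjects,
    const std::vector<ObjectFile>& archiveMembers)
{
    std::vector<std::string> linked;
    std::set<std::string> defined;
    std::set<std::string> needed;

    auto add = [&](const ObjectFile& obj) {
        linked.push_back(obj.name);
        defined.insert(obj.defined.begin(), obj.defined.end());
        needed.insert(obj.undefined.begin(), obj.undefined.end());
    };

    for (const auto& obj : explicitObjects) add(obj);

    // Keep scanning the archive until no further member resolves anything,
    // because a newly added member can introduce new undefined symbols.
    std::vector<bool> used(archiveMembers.size(), false);
    bool progress = true;
    while (progress) {
        progress = false;
        for (std::size_t i = 0; i < archiveMembers.size(); ++i) {
            if (used[i]) continue;
            bool resolvesSomething = false;
            for (const auto& sym : archiveMembers[i].defined) {
                if (needed.count(sym) && !defined.count(sym)) {
                    resolvesSomething = true;
                    break;
                }
            }
            if (resolvesSomething) {
                used[i] = true;
                add(archiveMembers[i]);
                progress = true;
            }
        }
    }
    return linked;
}
```

In this model, passing main.cpp.o and Hash.cpp.o both as explicit objects yields two linked files. Passing Hash.cpp.o as an archive member while main.cpp.o has no undefined symbols yields only main.cpp.o.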

A very, very close examination of the shown code leads to the profound discovery that main.cpp has no undefined symbols that Hash.cpp defines. So Hash.cpp does not get linked into the executable. The global object is defined in Hash.cpp, so the final executable does not end up with any global object that needs to be constructed when the program runs. The End.

It's not that the rules for initialization or construction change for static libraries. It's because there's nothing to initialize or construct.
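One way to see this rule in action is to give main.cpp a reference to something defined in Hash.cpp. The following is a minimal sketch, shown as a single block for brevity. The `registry()` and `linkHash()` names are hypothetical and not part of the original code, and the registry stands in for the `std::cout` output so the effect can be checked.

```cpp
#include <cassert>
#include <string>
#include <vector>

// --- The part below would live in Hash.cpp, inside the static library. ---

// Hypothetical registry, created on first use so that it is valid
// regardless of the order in which global objects are constructed.
std::vector<std::string>& registry()
{
    static std::vector<std::string> entries;
    return entries;
}

class Hash
{
public:
    Hash()
    {
        registry().push_back("Hash");
    }
};

Hash a;

// Hypothetical anchor function. It does nothing at run time, but it is an
// external symbol defined in Hash.cpp that other translation units can reference.
void linkHash() {}

// --- main.cpp would declare `void linkHash();` and call it from main(). ---
// That call leaves an undefined symbol in main.cpp.o, which the linker can
// only resolve by pulling Hash.cpp.o out of libHash.a, global object included.
```

With such a reference in place, Hash.cpp.o becomes part of the executable and `a` is constructed like any other global object. Linkers also offer options to include every member of an archive regardless of references (GNU ld has `--whole-archive`, for example), which avoids touching the sources.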

Ipomoea answered 18/8, 2023 at 1:18 Comment(5)
Part of my question was where to look this up. I checked my introductory textbooks and didn't find anything about it. Most of them don't seem to mention this topic at all, or don't describe the necessary details. I didn't find it even in The C++ Programming Language (4th Edition), but to be honest, I didn't read the whole 1000 pages in the last 7 hours. Maybe I've overlooked it. Now I know that it's linker-specific and not described in the C++ standard. – Duffel
Good question. I learned this before Stackoverflow, Reddit, and Twitter existed. The only possible places I could've learned it from were my textbooks and man pages. I have no specific recollection of where I picked this up. But there are simply no other possibilities. The knowledge didn't just pop into my head. It must've been a part of my studies. But these days, the infatuation with finding a magical shortcut to learning, by doing a bunch of dumb coding puzzles, I guess, must translate to a dearth of good textbooks. – Ipomoea
I guess it was a man page, documentation, or some text about linking or the linker. That's a topic I have to invest much more time in. – Duffel
@Duffel I'm not sure this would even be covered in a book about the language. Libraries are not a language feature. – Indeed
Indeed, this is not related to C++ at all; it is how static libraries work in a POSIX environment. For example (kinda poorly worded though): "When a program is linked against a static library, the machine code from the object files for any external functions used by the program is copied from the library into the final executable." linuxtopia.org/online_books/an_introduction_to_gcc/… – Danger
