Is there a way to force c++ compiler to not optimize out specific static objects in a static library?

(Only needs to work for gcc 5.4, if a general solution can't be found)

I have a generic factory that I use to construct objects based on some key (like a string representing a class name). The factory must allow classes to register that may not be known at construction time (so I can't simply register a list of classes explicitly).

As a means of registering these keys and their associated constructors, I have another 'RegisterInFactory' (templated) class. In each class's source file, I construct an object in an anonymous namespace corresponding to that class. This way, each class is automatically registered to the factory once the global objects are constructed. These objects never get used or referenced outside of doing this initial registration task.

However, when the code is compiled into a static library, when that library is linked into an executable, these static objects never get constructed, so the classes don't register to the factory, and the factory can't create anything.

I'm aware of the -Wl,--whole-archive -lfoo flag, which does include these global objects, but it also introduces a lot of 'multiple definition' errors. I know there is another flag that suppresses the multiple definition errors, but I don't feel comfortable going without those errors. I'm also aware of -u symbolName, which exempts specific symbol names from these multiple definition errors (at least that's what I think it does). However, there are just too many of these redundant functions for that to be realistic (mostly from protobuf classes).

Is there any way to tell the compiler not to optimize those objects out, but only those objects so I can avoid the multiple definition issue? Is there another pattern I might be able to follow that fits within the constraints? (Particularly that I do not know at compile time what classes may be registered to the factory.)

Simplified Example code: Factory.h:

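template<typename Base>
class Factory{
  ...
public:
  template<typename Derived>
  class RegisterInFactory{
  public:
    // key is the name the class registers under, e.g. "Derived"
    explicit RegisterInFactory(const std::string& key){
      instance().regInFactory(key, derivedConstructorFunctional);
    }
  };
};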

In Derived.cpp:

namespace{ BaseFactory::RegisterInFactory<Derived> registerMe{"Derived"}; }
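
For readers who want something compilable, here is a minimal sketch of the self-registration pattern described above. It is not the asker's actual code: Animal, Cat, create(), the creators_ map, checkCatIsRegistered() and the two-argument signature of regInFactory are assumptions made for illustration. Everything lives in one file here, so the registration object is always linked in; the problem in the question only appears once the registration moves into a static library.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class, used only for this illustration.
struct Animal {
    virtual ~Animal() = default;
    virtual std::string name() const = 0;
};

template <typename Base>
class Factory {
public:
    using Creator = std::function<std::unique_ptr<Base>()>;

    // Function-local static: constructed on first use, so it is ready even
    // when first called from another file's static initializer.
    static Factory& instance() {
        static Factory factory;
        return factory;
    }

    void regInFactory(const std::string& key, Creator creator) {
        creators_[key] = std::move(creator);
    }

    // Returns nullptr when nothing was registered under `key`.
    std::unique_ptr<Base> create(const std::string& key) const {
        auto it = creators_.find(key);
        if (it == creators_.end()) {
            return nullptr;
        }
        return it->second();
    }

    template <typename Derived>
    class RegisterInFactory {
    public:
        explicit RegisterInFactory(const std::string& key) {
            instance().regInFactory(key, [] {
                return std::unique_ptr<Base>(new Derived());
            });
        }
    };

private:
    Factory() = default;
    std::map<std::string, Creator> creators_;
};

using AnimalFactory = Factory<Animal>;

struct Cat : Animal {
    std::string name() const override { return "cat"; }
};

namespace {
// Constructed during static initialization of this translation unit.
AnimalFactory::RegisterInFactory<Cat> registerCat{"cat"};
}

// Usage check, meant to be called from main() after static initialization.
inline void checkCatIsRegistered() {
    std::unique_ptr<Animal> pet = AnimalFactory::instance().create("cat");
    assert(pet && pet->name() == "cat");
}
```

Keeping the map behind a function-local static avoids depending on the initialization order of globals across translation units.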

Final note: I've gotten lucky to some degree where without the linker flags, they still get included, but the only way that seems to happen is if the Derived class is 'sufficiently' complicated. Or maybe it's if I use the Derived class directly within the linked executable. I can't really tell why it's worked when it has.

Preconcert answered 8/11, 2017 at 17:48 Comment(4)
I think the issue is that static in namespace scope are not constructed until right before some code in that cpp file is used. Ergo, if no code in a cpp file is ever called, those statics may never be constructed. This has nothing to do with Gcc optimizations, this is how C++ works. MSVC is an oddball that ignores this and constructs all namespace statics before starting main. - Brawl
If you reference some symbol in a library module, it will be linked in (including the static objects). If there is no reference to anything in a module, it is not included. That's the way it is supposed to work. This has been asked several times before #14116920 - Borrow
I've browsed through the several other answers as well. My question is if there is a way to force specific objects to be included, as opposed to the whole-archive option, which is not only excessive, but problematic in its excess. @BasileStarynkevitch: That may be what I'm looking for, I'll dig into it to see if it fits. - Preconcert
If you simply want to ignore the multiple definition errors, you can use -Wl,--allow-multiple-definition - Kvass

The issue is not related to optimizations. It comes from how linkers pull symbols out of static libraries.

However, when the code is compiled into a static library, when that library is linked into an executable, these static objects never get constructed, so the classes don't register to the factory, and the factory can't create anything.

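That happens because nothing else refers to that registration variable. A linker only extracts an object file from an archive when that object file defines a symbol that is still undefined at that point in the link. Since no such reference exists, the linker never pulls in the object file that defines the registration variable.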

To tell a Unix linker to keep that registration variable even if nothing else refers to it, pass the -Wl,--undefined=<symbol> switch to the compiler when linking against that static library:

-u symbol

--undefined=symbol

Force symbol to be entered in the output file as an undefined symbol. Doing this may, for example, trigger linking of additional modules from standard libraries. -u may be repeated with different option arguments to enter additional undefined symbols.

If that registration variable has "C" linkage, then <symbol> is the variable name.

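For C++ linkage you will need to look up the mangled name using nm --defined-only <object-file>. You may also need to move that variable out of the anonymous namespace and into a named one, so that it has external linkage. A variable with internal linkage is a local symbol, and --undefined cannot pull it in.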
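
If you control the library source, one way to avoid looking up mangled names is to put a small "C" linkage anchor in the same source file as the registration object and force that anchor instead. The following is only a sketch: mylib_register_anchor, registrationCount() and checkRegistered() are hypothetical names that exist just to make the effect observable. Because the linker extracts whole object files from an archive, forcing the anchor should also bring in the registration object and its static initializer.

```cpp
#include <cassert>
#include <cstdio>

namespace mylib {

// Hypothetical counter so the constructor's effect can be observed.
inline int& registrationCount() {
    static int count = 0;
    return count;
}

struct Register {
    Register() {
        ++registrationCount();
        std::printf("mylib::Register constructed\n");
    }
};

// Internal linkage is fine here: nothing outside this file names the object.
namespace {
Register register_me;
}

}  // namespace mylib

// C linkage: the symbol is literally "mylib_register_anchor", so
// -Wl,--undefined=mylib_register_anchor needs no mangled name.
extern "C" {
int mylib_register_anchor = 0;
}

// Sanity check that a program linking this file could call from main().
inline void checkRegistered() {
    assert(mylib::registrationCount() == 1);
}
```

When this file is part of libmylib.a, the executable would then be linked with -Wl,--undefined=mylib_register_anchor in place of the mangled variable name.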


Example:

[max@supernova:~/src/test] $ cat mylib.cc
#include <cstdio>

namespace mylib {

struct Register
{
    Register() { std::printf("%s\n", __PRETTY_FUNCTION__); }
};

Register register_me;

}

[max@supernova:~/src/test] $ cat test.cc
#include <iostream>

int main() {
    std::cout << "Hello, world!\n";
}

[max@supernova:~/src/test] $ make
mkdir /home/max/src/test/debug
g++ -c -o /home/max/src/test/debug/test.o -MD -MP -std=gnu++14 -march=native -pthread -W{all,extra,error,inline} -ggdb -fmessage-length=0 -Og test.cc
g++ -c -o /home/max/src/test/debug/mylib.o -MD -MP -std=gnu++14 -march=native -pthread -W{all,extra,error,inline} -ggdb -fmessage-length=0 -Og mylib.cc
ar rcsT /home/max/src/test/debug/libmylib.a /home/max/src/test/debug/mylib.o
g++ -o /home/max/src/test/debug/test -ggdb -pthread /home/max/src/test/debug/test.o /home/max/src/test/debug/libmylib.a

[max@supernova:~/src/test] $ ./debug/test 
Hello, world! <-------- Missing output from mylib::register_me.

[max@supernova:~/src/test] $ nm --defined-only -C debug/mylib.o
0000000000000044 t _GLOBAL__sub_I__ZN5mylib11register_meE
0000000000000000 t __static_initialization_and_destruction_0(int, int)
0000000000000000 B mylib::register_me                        <-------- Need a mangled name for this.
0000000000000000 r mylib::Register::Register()::__PRETTY_FUNCTION__

[max@supernova:~/src/test] $ nm --defined-only debug/mylib.o
0000000000000044 t _GLOBAL__sub_I__ZN5mylib11register_meE
0000000000000000 t _Z41__static_initialization_and_destruction_0ii
0000000000000000 B _ZN5mylib11register_meE                   <-------- The mangled name for that.
0000000000000000 r _ZZN5mylib8RegisterC4EvE19__PRETTY_FUNCTION__

# Added -Wl,--undefined=_ZN5mylib11register_meE to Makefile.
[max@supernova:~/src/test] $ make 
g++ -o /home/max/src/test/debug/test -ggdb -pthread -Wl,--undefined=_ZN5mylib11register_meE /home/max/src/test/debug/test.o /home/max/src/test/debug/libmylib.a

[max@supernova:~/src/test] $ ./debug/test 
mylib::Register::Register() <-------- Output from mylib::register_me as expected.
Hello, world!
Disengagement answered 8/11, 2017 at 18:8 Comment(7)
+1 for this approach. This works beautifully and is (IMHO) much less error prone than the whole-archive approach. - Coed
I completely misunderstood the undefined symbol flag, then. Testing this approach now - Preconcert
@Coed It works reliably, but the downside is that you need to specify that linker switch when linking that archive, which can be overlooked. - Disengagement
I added -Wl,--undefined=_ZN7animals12_GLOBAL__N_110registerMeE as a flag to the linker in my test program (with animals::[anonymous namespace]::RegisterInFactory<Cat> registerMe{"cat"}; as the object, for example). That didn't seem to include the object. I moved the registrar out of the anonymous namespace --undefined=_ZN7animals10registerMeB5cxx11E, and still didn't seem to work. I may need to study this undefined flag more to know what's wrong. - Preconcert
@MaximEgorushkin Thanks so much! I look forward to experimenting with this more - Preconcert
This trick only works if the symbols are global (i.e. not static) and show up with B in the nm output (as opposed to b). - Kvass
@Kvass It is not a trick, rather how linkage works. Static means internal linkage and such symbols aren't accessible from different translation units. - Disengagement

I ran into this same problem, and was able to resolve it in the code to get a similar effect. The gist of the solution is:

In Derived.hpp:

class Derived
{
private:
    static const BaseFactory::RegisterInFactory<Derived> registerMe;
};

inline const BaseFactory::RegisterInFactory<Derived> Derived::registerMe{"Derived"};

You're still unable to access registerMe outside the class, and at least in my case, this got me past the problem of the registration object not getting constructed. Note that inline variables require C++17, so this will not compile with gcc 5.4.
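
A self-contained sketch of this header-based approach follows. It needs C++17, and the BaseFactory shown here is a hypothetical stand-in (a plain map of creator functions), not the factory from the question. Base, registry(), Creator and checkDerivedIsRegistered() are likewise names invented for the example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct Base {
    virtual ~Base() = default;
};

// Hypothetical stand-in for BaseFactory: maps a key to a creator function.
class BaseFactory {
public:
    using Creator = std::unique_ptr<Base> (*)();

    static std::map<std::string, Creator>& registry() {
        static std::map<std::string, Creator> creators;  // built on first use
        return creators;
    }

    template <typename Derived>
    struct RegisterInFactory {
        explicit RegisterInFactory(const char* key) {
            registry()[key] = []() -> std::unique_ptr<Base> {
                return std::make_unique<Derived>();
            };
        }
    };
};

class Derived : public Base {
private:
    // Declared inside the class, so it stays inaccessible from outside.
    static const BaseFactory::RegisterInFactory<Derived> registerMe;
};

// C++17 inline variable: the definition can live in a header that many
// translation units include without breaking the one-definition rule.
inline const BaseFactory::RegisterInFactory<Derived> Derived::registerMe{"Derived"};

// Usage check, meant to be called from main().
inline void checkDerivedIsRegistered() {
    assert(BaseFactory::registry().count("Derived") == 1);
}
```

With the definition in the header, the registration no longer depends on the linker extracting Derived's own object file from the archive, as long as some translation unit that does get linked includes Derived.hpp.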

Isotone answered 5/6 at 13:13 Comment(0)
