I am seeing LTO optimize some global objects out of a TU if no function in that TU is explicitly called from another TU.
The following excerpt attempts to describe the key classes and files involved (please note that it's just for demonstration purposes and may not be completely accurate in all the places):
I have a singleton class Registrar that maintains a list of all the objects of type Foo that have been constructed. To avoid the static initialization order fiasco, I dynamically construct the instance of this object when the first object of type Foo is constructed.
// Registrar.hpp
class Registrar
{
public:
static Registrar * sRegistrar;
std::vector<Foo *> objectList;
Registrar() = default;
};
Next, we have the class Foo. This class's instances register with Registrar as noted above.
// Foo.hpp
class Foo
{
public:
Foo()
{
if (Registrar::sRegistrar == nullptr)
Registrar::sRegistrar = new Registrar();
Registrar::sRegistrar->objectList.push_back(this);
}
};
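For reference, here is a minimal single-file sketch of the registration pattern described above. It assumes everything lives in one translation unit, so it does not reproduce the problem; it only shows the behaviour I expect. The helper registeredCount() is hypothetical and not part of my real code, and the inline static member is used only to keep the sketch in one file.

```cpp
#include <cstddef>
#include <vector>

class Foo;

// Registry that is created lazily by the first Foo constructor.
class Registrar
{
public:
    // C++17 inline variable: constant-initialized to nullptr before any
    // dynamic initialization (such as the Foo globals below) runs.
    static inline Registrar * sRegistrar = nullptr;
    std::vector<Foo *> objectList;
};

class Foo
{
public:
    Foo()
    {
        // Create the registry on first use, then register this instance.
        if (Registrar::sRegistrar == nullptr)
            Registrar::sRegistrar = new Registrar();
        Registrar::sRegistrar->objectList.push_back(this);
    }
};

namespace
{
    // Two globals in the same TU; both constructors run before main().
    Foo f1;
    Foo f2;
}

// Hypothetical helper (not in the original code): number of registered Foos.
std::size_t registeredCount()
{
    return Registrar::sRegistrar ? Registrar::sRegistrar->objectList.size() : 0;
}
```

With both globals in the same TU, registeredCount() called from main() should report both objects, which is the result I would like to get when they are spread over several files.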
The instances of Foo are globals that may be created from several files. In one such file, we happen to have another function defined that gets called from elsewhere:
// file1.hpp
void someFunctionThatIsCalledExplicitly()
{
doSomething();
}
namespace
{
__attribute__((used, retain))
Foo f1;
}
But in another file, we just have an instance of Foo being created:
// file2.hpp
namespace
{
__attribute__((used, retain))
Foo f2;
}
What I am seeing is that f2 is getting optimized out, while f1 is not. This is despite adding __attribute__((used, retain)) to all declarations of objects of class Foo.
How should I prevent LTO from optimizing out these instances? Why are the attributes making no difference?
EDIT: I was able to write a small example to reproduce said issue.
- main.cpp:
#include <iostream>
#include "Registrar.hpp"
#ifdef FORCE_LINKAGE
extern int i;
#endif
extern void someFunctionThatIsCalledExplicitly();
int main()
{
#ifdef FORCE_LINKAGE
i++;
#endif
someFunctionThatIsCalledExplicitly();
if (Registrar::sRegistrar == nullptr)
{
std::cout << "No instances of foo";
}
else
{
std::cout << Registrar::sRegistrar->objectList.size() << " instances of foo\n";
}
return 0;
}
- Foo.hpp
#pragma once
class Foo
{
public:
Foo();
};
- Foo.cpp:
#include "Foo.hpp"
#include "Registrar.hpp"
Foo::Foo()
{
if (Registrar::sRegistrar == nullptr)
{
Registrar::sRegistrar = new Registrar();
}
Registrar::sRegistrar->objectList.push_back(this);
}
- Registrar.hpp:
#pragma once
#include <vector>
#include "Foo.hpp"
class Registrar
{
public:
static Registrar * sRegistrar;
std::vector<Foo *> objectList;
Registrar() = default;
};
- Registrar.cpp:
#include "Registrar.hpp"
Registrar * Registrar::sRegistrar = nullptr;
- File1.cpp:
#include <iostream>
#include "Foo.hpp"
void someFunctionThatIsCalledExplicitly()
{
std::cout << "someFunctionThatIsCalledExplicitly() called\n";
}
namespace
{
__attribute__((used, retain))
Foo f1;
}
- File2.cpp:
#include "Foo.hpp"
#ifdef FORCE_LINKAGE
int i = 0;
#endif
namespace
{
__attribute__((used, retain))
Foo f2;
}
- Makefile:
CC = clang++
LIBTOOL = libtool
BUILDDIR = build
BINFILE = lto
BUILDFLAGS = -flto -std=c++17
LINKFLAGS = -flto
.PHONY: all
all: $(BUILDDIR) $(BINFILE)
.PHONY: force
force: def all
.PHONY: def
def:
$(eval BUILDFLAGS += -DFORCE_LINKAGE)
$(BINFILE): foo files
$(CC) -o $(BUILDDIR)/$@ $(LINKFLAGS) -L$(BUILDDIR) $(addprefix -l, $^)
foo: Foo.o main.o Registrar.o
$(LIBTOOL) $(STATIC) -o $(BUILDDIR)/lib$@.a $(addprefix $(BUILDDIR)/, $^)
files: File1.o File2.o
$(LIBTOOL) $(STATIC) -o $(BUILDDIR)/lib$@.a $(addprefix $(BUILDDIR)/, $^)
%.o: %.cpp
$(CC) $(BUILDFLAGS) -c -o $(addprefix $(BUILDDIR)/, $@) $<
.PHONY: $(BUILDDIR)
$(BUILDDIR):
mkdir -p $(BUILDDIR)
.PHONY: clean
clean:
rm -rf $(BUILDDIR)
I have two variants, one which is similar to above (I only see 1 instance) and another where I force linkage by declaring a global variable that I refer to elsewhere (here I see both instances):
$ make
$ ./build/lto
someFunctionThatIsCalledExplicitly() called
1 instances of foo
$ make force
$ ./build/lto
someFunctionThatIsCalledExplicitly() called
2 instances of foo
Foo is used afterwards by iterating through the vector maintained by the Registrar instance. – Variole
f2 should run and it has side effects. So it shouldn't be able to omit it. I don't think any of the rules that allow construction to be elided, like RVO, apply here. – Flounce
-o option on my system. What versions of clang and other tools do you use, btw? – Warta