LTO optimizing out global variables

I am seeing LTO optimize some global objects out of a TU if no function in that TU is explicitly called from another TU.

The following excerpt attempts to describe the key classes and files involved (please note that it's just for demonstration purposes and may not be completely accurate in all the places):

I have a singleton class Registrar that maintains a list of all the objects of type Foo that have been constructed. To avoid the static order of construction fiasco, I dynamically construct the instance of this object when the first object of type Foo has been constructed.

// Registrar.hpp
class Registrar
{
public:
  static Registrar * sRegistrar;
  std::vector<Foo *> objectList;
  Registrar() = default;
};

Next, we have the class Foo. This class's instances register with Registrar as noted above.

// Foo.hpp
class Foo
{
public:
  Foo()
  {
    if (Registrar::sRegistrar == nullptr)
      Registrar::sRegistrar = new Registrar();

    Registrar::sRegistrar->objectList.push_back(this);
  }
};
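For comparison, the same construct-on-first-use idea is often written with a function-local static instead of a raw pointer. The sketch below is a single-file variant, not the code from the question: `instance()` and `registeredCount()` are hypothetical names introduced here for illustration. It only addresses initialization order; it does not change which object files the linker pulls in.

```cpp
#include <cstddef>
#include <vector>

class Foo;

// Hypothetical variant of Registrar: the instance lives in a function-local
// static, so it is constructed the first time instance() is called.
class Registrar
{
public:
    static Registrar & instance()
    {
        static Registrar registrar;  // initialized on first use (thread-safe since C++11)
        return registrar;
    }

    std::vector<Foo *> objectList;

private:
    Registrar() = default;
};

class Foo
{
public:
    // Each Foo registers itself; the Registrar is created on demand.
    Foo() { Registrar::instance().objectList.push_back(this); }
};

namespace
{
    Foo f1;  // both globals register themselves during static initialization
    Foo f2;
}

// Helper for illustration only: number of objects registered so far.
std::size_t registeredCount()
{
    return Registrar::instance().objectList.size();
}
```

Because everything sits in one translation unit here, both globals are always constructed before `main` runs, so `registeredCount()` reports 2 at that point.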

The instances of Foo are globals that may be created from several files. In one such file, we happen to have another function defined that gets called from elsewhere:

// file1.hpp
void someFunctionThatIsCalledExplicitly()
{
  doSomething();
}

namespace 
{
  __attribute__((used, retain))
  Foo f1;
}

But in another file, we just have an instance of Foo being created:

// file2.hpp
namespace 
{
  __attribute__((used, retain))
  Foo f2;
}

What I am seeing is that f2 is getting optimized out, while f1 is not. This happens despite adding __attribute__((used, retain)) to all declarations of objects of class Foo.

How should I prevent LTO from optimizing out these instances? Why are the attributes making no difference?

EDIT: I was able to write a small example to reproduce said issue.

  1. main.cpp:
#include <iostream>
#include "Registrar.hpp"

#ifdef FORCE_LINKAGE
extern int i;
#endif

extern void someFunctionThatIsCalledExplicitly();

int main()
{
    #ifdef FORCE_LINKAGE
    i++;
    #endif

    someFunctionThatIsCalledExplicitly();

    if (Registrar::sRegistrar == nullptr)
    {
        std::cout << "No instances of foo";
    }
    else
    {
        std::cout << Registrar::sRegistrar->objectList.size() << " instances of foo\n";
    }

    return 0;
}
  1. Foo.hpp
#pragma once

class Foo
{
public:
    Foo();
};
  1. Foo.cpp:
#include "Foo.hpp"
#include "Registrar.hpp"

Foo::Foo()
{
    if (Registrar::sRegistrar == nullptr)
    {
        Registrar::sRegistrar = new Registrar();
    }

    Registrar::sRegistrar->objectList.push_back(this);
}
  1. Registrar.hpp:
#pragma once

#include <vector>
#include "Foo.hpp"

class Registrar
{
public:
    static Registrar * sRegistrar;
    std::vector<Foo *> objectList;

    Registrar() = default;
};
  1. Registrar.cpp:
#include "Registrar.hpp"

Registrar * Registrar::sRegistrar = nullptr;
  1. File1.cpp:
#include <iostream>
#include "Foo.hpp"

void someFunctionThatIsCalledExplicitly()
{
    std::cout << "someFunctionThatIsCalledExplicitly() called\n";
}

namespace
{
    __attribute__((used, retain))
    Foo f1;
}
  1. File2.cpp:
#include "Foo.hpp"

#ifdef FORCE_LINKAGE
int i = 0;
#endif

namespace
{
  __attribute__((used, retain))
  Foo f2;
}
  1. Makefile:
CC          = clang++
LIBTOOL     = libtool
BUILDDIR    = build
BINFILE     = lto

BUILDFLAGS  = -flto -std=c++17
LINKFLAGS   = -flto

.PHONY:     all
all:        $(BUILDDIR) $(BINFILE)

.PHONY:     force
force:      def all

.PHONY:     def
def:
    $(eval BUILDFLAGS += -DFORCE_LINKAGE)

$(BINFILE): foo files
    $(CC) -o $(BUILDDIR)/$@ $(LINKFLAGS) -L$(BUILDDIR) $(addprefix -l, $^)

foo:        Foo.o main.o Registrar.o
    $(LIBTOOL) $(STATIC) -o $(BUILDDIR)/lib$@.a $(addprefix $(BUILDDIR)/, $^)

files:  File1.o File2.o
    $(LIBTOOL) $(STATIC) -o $(BUILDDIR)/lib$@.a $(addprefix $(BUILDDIR)/, $^)

%.o:        %.cpp
    $(CC) $(BUILDFLAGS) -c -o $(addprefix $(BUILDDIR)/, $@) $<

.PHONY:     $(BUILDDIR)
$(BUILDDIR):
    mkdir -p $(BUILDDIR)

.PHONY:     clean
clean:
    rm -rf $(BUILDDIR)

I have two variants, one which is similar to above (I only see 1 instance) and another where I force linkage by declaring a global variable that I refer to elsewhere (here I see both instances):

$ make
$ ./build/lto
someFunctionThatIsCalledExplicitly() called
1 instances of foo

$ make force
$ ./build/lto
someFunctionThatIsCalledExplicitly() called
2 instances of foo
Variole answered 12/7, 2022 at 20:55 Comment(11)
They don't appear to be used. - Centreboard
They are not explicitly used, yes, but shouldn't the attributes ensure that these objects are still not optimized out? What I haven't shown here is that each instance of Foo is used afterwards by iterating through the vector maintained by the Registrar instance. - Variole
Most likely what's happening is that any variable in an anonymous namespace is forced to have internal linkage, and the linker is allowed to remove unused internal-linkage variables. - Yeseniayeshiva
But the constructor of f2 should run and it has side effects. So it shouldn't be able to omit it. I don't think any of the rules that allow construction to be elided, like RVO, apply here. - Flounce
@NikitaKniazev, I tried taking the anonymous namespace out, but it had no effect. - Variole
Could you show the actual MRE? Say, headers+sources? At this stage there's e.g. a non-inline function definition inside an hpp file, same for variables. I may assume several omissions have been made here for the sake of simplification, but those things matter. Especially when LTO is involved. - Warta
BTW, just guessing (I might be completely wrong), but aren't you hitting the static initialization order fiasco? - Warta
Constructors can be elided or optimized. Any side effect in them may or may not be executed, so the side effects need to be benign in the case that the constructor is not executed. - Centreboard
@Warta Unfortunately, I cannot show the actual source code. As for the static initialization order fiasco, it doesn't matter in what order f1 and f2 are created. - Variole
It will be really hard to address this without any details. I did fiddle with your code and it works for me: godbolt.org/z/9aq8xY1ex - Warta
Are you sure the libtool invocation is fine? It keeps complaining about an incorrect -o option on my system. What versions of clang and other tools do you use, btw? - Warta

OK, I did some digging: the culprit here is the fact that you're linking a .a library, not LTO or any other optimization.

This has been brought up on SO before, btw; see: Static initialization and destruction of a static library's globals not happening with g++

When linking the .o files (as I did on godbolt) everything goes in and it works.

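With .a files, the linker extracts only those archive members (object files) that resolve a symbol which is still undefined; every other member is left out, together with its global objects and their constructors. Nothing references a symbol defined in File2.o, so f2 never makes it into the link, and the used and retain attributes cannot help because they only matter once the object file is part of the link. Creating a dummy variable is one workaround, but the proper one is passing --whole-archive to the linker (that is the GNU ld and LLD spelling; Apple's linker offers -all_load and -force_load instead).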
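To make the selection rule concrete, here is a toy model of it in C++. It is only a sketch of the behaviour described above, not how any real linker is implemented: `Member`, `selectMembers` and `filesArchive` are hypothetical names, the symbol names are simplified (no mangling), and the archive contents are written out by hand from the question's File1.cpp and File2.cpp.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model of one archive member (an object file inside a .a).
struct Member
{
    std::string name;
    std::set<std::string> defines;     // external symbols this object defines
    std::set<std::string> references;  // external symbols this object needs
};

// Returns the members that would be extracted: a member is pulled in only if
// it defines a symbol that is currently undefined. With wholeArchive set,
// every member is taken unconditionally.
std::vector<std::string> selectMembers(const std::vector<Member> & archive,
                                       std::set<std::string> undefined,
                                       bool wholeArchive)
{
    std::vector<std::string> selected;
    std::set<std::string> defined;
    std::vector<bool> taken(archive.size(), false);

    bool progress = true;
    while (progress)  // repeat until a full pass extracts nothing new
    {
        progress = false;
        for (std::size_t i = 0; i < archive.size(); ++i)
        {
            if (taken[i]) continue;

            bool needed = wholeArchive;
            for (const auto & sym : archive[i].defines)
                if (undefined.count(sym)) needed = true;
            if (!needed) continue;

            taken[i] = true;
            progress = true;
            selected.push_back(archive[i].name);
            for (const auto & sym : archive[i].defines)
            {
                defined.insert(sym);
                undefined.erase(sym);
            }
            // Newly extracted members may introduce further undefined symbols.
            for (const auto & sym : archive[i].references)
                if (!defined.count(sym)) undefined.insert(sym);
        }
    }
    return selected;
}

// The question's libfiles.a, modelled by hand. f1 and f2 have internal
// linkage, so they contribute no symbol that another object could reference.
std::vector<Member> filesArchive(bool forceLinkage)
{
    Member file1{"File1.o", {"someFunctionThatIsCalledExplicitly"}, {"Foo::Foo"}};
    Member file2{"File2.o", {}, {"Foo::Foo"}};
    if (forceLinkage) file2.defines.insert("i");  // the FORCE_LINKAGE variable
    return {file1, file2};
}
```

In this model, calling `selectMembers(filesArchive(false), {"someFunctionThatIsCalledExplicitly"}, false)` selects only File1.o, which mirrors the "1 instances of foo" run. Adding `"i"` to the undefined set with `filesArchive(true)`, or passing `wholeArchive = true`, selects both members.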

I could not run your makefile-based example due to issues with libtool, but have a look at my CMake config:

cmake_minimum_required(VERSION 3.18)
project(LINK)


set(CMAKE_CXX_STANDARD 17)
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY "${PROJECT_BINARY_DIR}")
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${PROJECT_BINARY_DIR}")
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "${PROJECT_BINARY_DIR}")

add_library(Files File1.cpp File2.cpp)


target_include_directories(Files
                           INTERFACE ${CMAKE_CURRENT_SOURCE_DIR}
                           )
target_compile_definitions(Files PUBLIC ${FORCE})

add_executable(test Foo.cpp main.cpp Registrar.cpp)
# note the line below
target_link_libraries(test -Wl,--whole-archive Files -Wl,--no-whole-archive)
target_compile_definitions(test PUBLIC ${FORCE})

When linking, it will invoke a command more or less like this:

g++ -o test -Wl,--whole-archive -l:libFiles.a -Wl,--no-whole-archive Foo.o Registrar.o main.o

Warta answered 14/7, 2022 at 23:37 Comment(0)
