How can I use explicit template instantiation for template member functions defined within a class definition?

In an effort to reduce compilation times in a large project that makes liberal use of templates, I've had good results using "extern template" (explicit instantiation declarations) to prevent common template functions from being instantiated in many different compilation units.

However, one annoying thing about it is that it doesn't work for member functions defined within the class definition.

For example, I have the following template class:

template <typename T>
struct Foo
{
    static T doubleIt(T input)
    {
        return input * 2;
    }
};

Now, I know that Foo is most commonly used for numeric types, so I add this to the header:

extern template struct Foo<int>;
extern template struct Foo<float>;
extern template struct Foo<double>;

And in a cpp file, add explicit instantiations:

template struct Foo<int>;
template struct Foo<float>;
template struct Foo<double>;

This does not work: dumpbin.exe on the obj file shows that the function is still emitted into a section (SECT4) instead of being left as an undefined external:

017 00000000 SECT4  notype ()    External     | ?doubleIt@?$Foo@M@@SAMM@Z (public: static float __cdecl Foo<float>::doubleIt(float))

If I change my class to define the function outside the class definition, like so, it works correctly:

template <typename T>
struct Foo
{
    static T doubleIt(T input);
};

template <typename T>
T Foo<T>::doubleIt(T input)
{
    return input * 2;
}

Which we can verify using dumpbin:

017 00000000 UNDEF  notype ()    External     | ?doubleIt@?$Foo@M@@SAMM@Z (public: static float __cdecl Foo<float>::doubleIt(float))
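
For reference, the working arrangement can be sketched end to end in one listing. This is a minimal sketch, not the project's real layout: the file names in the comments are hypothetical, and the three parts are shown in a single translation unit only so that the listing compiles on its own.

```cpp
// --- Foo.h (hypothetical file name) ---
template <typename T>
struct Foo
{
    static T doubleIt(T input);   // declaration only, so not implicitly inline
};

// Out-of-class definition, still visible to every includer of the header.
template <typename T>
T Foo<T>::doubleIt(T input)
{
    return input * 2;
}

// Explicit instantiation declarations: includers should not instantiate
// these specializations themselves.
extern template struct Foo<int>;
extern template struct Foo<float>;
extern template struct Foo<double>;

// --- Foo.cpp (hypothetical file name) ---
// Explicit instantiation definitions: the one place the code is emitted.
// A definition may follow a declaration in the same translation unit.
template struct Foo<int>;
template struct Foo<float>;
template struct Foo<double>;

// --- any other .cpp (hypothetical caller) ---
int useFoo(int x)
{
    return Foo<int>::doubleIt(x);
}
```

In a real project only the Foo.h part is seen by most compilation units, which is why their object files list doubleIt as UNDEF.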

The problem with that solution is that it is a lot of typing to move all the function definitions outside of the class definition, especially when you get more template parameters.

I've tried using __declspec(noinline) but it still doesn't extern the functions correctly (and preventing the inlining of the function where possible is undesirable).

One thing that works is to enumerate each function individually, like so, but that of course is even more cumbersome:

extern template int Foo<int>::doubleIt(int);
extern template float Foo<float>::doubleIt(float);
extern template double Foo<double>::doubleIt(double);
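
A sketch of the full per-function variant, assuming the declarations above live in the header and matching explicit instantiation definitions live in exactly one .cpp file. Both halves are shown in one listing so that it compiles on its own. Whether the declarations actually suppress emission of an in-class (inline) member depends on the compiler.

```cpp
// Header side: the class keeps its in-class definition.
template <typename T>
struct Foo
{
    static T doubleIt(T input)
    {
        return input * 2;
    }
};

// Header side: one explicit instantiation declaration per member function.
extern template int Foo<int>::doubleIt(int);
extern template float Foo<float>::doubleIt(float);
extern template double Foo<double>::doubleIt(double);

// Single .cpp side: the matching explicit instantiation definitions.
template int Foo<int>::doubleIt(int);
template float Foo<float>::doubleIt(float);
template double Foo<double>::doubleIt(double);
```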

What I would like is a way to keep the function definition inside of the class definition, while still allowing the function to be inlined where possible, but when it is not inlined, only creating it in the compilation unit where it is explicitly instantiated (in other words, exactly the same behavior as moving the function outside of the class definition).

Soria answered 1/3, 2018 at 10:39 Comment(0)

You can't have it both ways. To inline the method, the compiler needs its source code, and because the method is defined inline, the compiler doesn't bother emitting it into an object file if it isn't used directly in that object (and even if it is used, it won't appear in the object as a separate function when every call is inlined). The compiler will always have to compile your function if it's defined in the header; somehow forcing the compiler to store a copy of that function in the object file won't improve performance.
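
For context, the language rule behind this behavior can be sketched as follows. A member function defined inside the class definition is implicitly inline, and the C++ standard exempts inline functions from the suppressing effect of an explicit instantiation declaration. The names InClass and OutOfClass are hypothetical, used only to put the two forms side by side, and actual emission still varies by compiler and optimization level.

```cpp
// Defined inside the class definition: implicitly inline.
template <typename T>
struct InClass
{
    static T doubleIt(T input) { return input * 2; }
};

// Declared inside, defined outside without 'inline': not an inline function.
template <typename T>
struct OutOfClass
{
    static T doubleIt(T input);
};

template <typename T>
T OutOfClass<T>::doubleIt(T input)
{
    return input * 2;
}

// For InClass<float>::doubleIt the declaration need not suppress implicit
// instantiation, because the member is inline.
extern template struct InClass<float>;
// For OutOfClass<float>::doubleIt it does; includers reference the symbol only.
extern template struct OutOfClass<float>;

// Explicit instantiation definitions, which would live in one .cpp file.
template struct InClass<float>;
template struct OutOfClass<float>;
```

Adding the inline keyword to the out-of-class definition would put it back in the first category.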

Brain answered 1/3, 2018 at 11:46 Comment(1)
I think you misunderstood the question. When the compiler can inline the function, great, let it do so. However, when the compiler can't inline the function, I don't want it to make 200 copies in each of the .cpp files. I get EXACTLY the behavior I want when I move the code to the header, but outside of the class definition. There is no reason that there couldn't be a keyword or something else to make a function defined inside the class definition behave as though it were defined outside. – Soria

As has been pointed out, you cannot have both extern and inlining, but about the extra typing part, I did something like that and tried to minimize it using the preprocessor. I'm not sure if you'd find that useful, but just in case, I'll put an example with a template class that has a template function inside.

File Foo.h:

template<typename T1>
struct Foo
{
    void bar(T1 input)
    {
        // ...
    }

    template<typename T2>
    void baz(T1 input1, T2 input2);
};
#include <Foo.inl>

File Foo.cc:

#include <Foo.h>

template<typename T1>
template<typename T2>
void Foo<T1>::baz(T1 input1, T2 input2)
{
    // ...
}
#define __FOO_IMPL
#include <Foo.inl>
#undef __FOO_IMPL

File Foo.inl:

#ifdef __FOO_IMPL
#define __FOO_EXTERN
#else
#define __FOO_EXTERN extern
#endif

#define __FOO_BAZ_INST(T1, T2) \
    __FOO_EXTERN template void Foo<T1>::baz<T2>(T1, T2);

#define __FOO_INST(T1) \
    __FOO_EXTERN template struct Foo<T1>; \
    __FOO_BAZ_INST(T1, int) \
    __FOO_BAZ_INST(T1, float) \
    __FOO_BAZ_INST(T1, double)

__FOO_INST(int)
__FOO_INST(float)
__FOO_INST(double)

#undef __FOO_INST
#undef __FOO_BAZ_INST
#undef __FOO_EXTERN

So it is still quite some writing, but at least you don't have to be careful to keep two different sets of template declarations in sync, and you don't have to explicitly go through every possible combination of types. In my case, I had a class template with two type parameters and a couple of member function templates with an extra type parameter, and each of them could take one of 12 possible types. 36 lines is better than 12^3 = 1728, although I would have preferred the preprocessor to somehow iterate through the list of types for each parameter, but couldn't work out how.
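
One way the iteration over type lists could be sketched is with X-macro style list macros, where each list takes the macro to apply as an argument. This is a minimal sketch under assumptions. The macro names (FOO_T1_LIST, FOO_T2_LIST, FOO_INST, FOO_INST_BAZ) are hypothetical and avoid the double underscore prefix, which is reserved for the implementation. The member `last` is added only so that the effect of baz is observable. The extern switch from Foo.inl is left out to keep it short.

```cpp
template <typename T1>
struct Foo
{
    double last = 0;   // illustrative state, not part of the original class

    template <typename T2>
    void baz(T1 input1, T2 input2);
};

template <typename T1>
template <typename T2>
void Foo<T1>::baz(T1 input1, T2 input2)
{
    last = static_cast<double>(input1) + static_cast<double>(input2);
}

// Each type list is written once; X is the macro applied to every entry.
#define FOO_T1_LIST(X) X(int) X(float) X(double)
#define FOO_T2_LIST(X, T1) X(T1, int) X(T1, float) X(T1, double)

// Instantiate one member template specialization.
#define FOO_INST_BAZ(T1, T2) template void Foo<T1>::baz<T2>(T1, T2);

// Instantiate the class for T1, then baz for every T2 in the second list.
#define FOO_INST(T1) \
    template struct Foo<T1>; \
    FOO_T2_LIST(FOO_INST_BAZ, T1)

// Expands to 3 class instantiations and 3 x 3 baz instantiations.
FOO_T1_LIST(FOO_INST)

#undef FOO_INST
#undef FOO_INST_BAZ
#undef FOO_T2_LIST
#undef FOO_T1_LIST
```

The two lists are separate macros on purpose: a macro is not expanded again inside its own expansion, so a single list cannot be nested within itself to form the cross product.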

As a side note, in my case I was compiling a DLL where I needed all the templates to be compiled, so actually the template instantiations/declarations looked more like __FOO_EXTERN template __FOO_API ....

Hiedihiemal answered 1/3, 2018 at 12:33 Comment(1)
It looks like your solution was just to avoid having to repeat yourself when enumerating the list of types. In my case, the list of types is small and easy to keep in sync (and the compiler notices if you screw up). Thanks for the reply, but I don't think it's worth all the macro boilerplate in my case. – Soria
