In my program I have some code which looks something like this: A collection of functions or classes which uniquely implement a common template via template specialization:
constexpr int NUM_SPECIALIZATIONS = 32;
template<int num>
void print_num(){}
template<>
void print_num<0>(){cout << "Zero" << endl;}
template<>
void print_num<1>(){cout << "One" << endl;}
template<>
void print_num<2>(){cout << "Two" << endl;}
// etc, ...
template<>
void print_num<31>(){cout << "Thirty-One" << endl;}
Then I have a variable whose value is only known at run time:
int my_num;
cin >> my_num; // Could be 0, could be 2, could be 27, who tf knows
Then I need to call the template specialization which corresponds to the value of the variable. Since I can't use a variable as a template parameter, I would create a sort of "interpreter":
switch(my_num)
{
case 0:
print_num<0>();
break;
case 1:
print_num<1>();
break;
case 2:
print_num<2>();
break;
// etc, ...
case 31:
print_num<31>();
break;
}
The first thing I notice about this code is that it is repetitive. Surely there must be some sort of trick to procedurally generate this code.
The other thing I notice is that it is inconvenient to maintain, since it is coupled with the template specializations; Each time I want to add a new template specialization, I need to update the interpreter as well.
Ideally, I would be able to use some sort of template magic to generate the interpreter automatically at compile time so that both sections of code can remain decoupled while I maintain the efficiency of the switch statement:
// Copies and pastes the code found in template lambda "foo",
// Replacing all occurrences of its template parameter with values from
// "begin" until "end"
template<auto begin, auto end>
inline void unroll(auto foo)
{
if constexpr(begin < end)
{
foo.template operator()<begin>();
unroll<begin + 1, end>(foo);
}
}
// A template lambda which generates a generic switch case for the interpreter
auto template_lambda = [&]<int NUM>()
{
case NUM:
print_num<NUM>();
break;
};
// The interpreter; contains the code "case NUM: print_num<NUM>(); break;"
// repeated for all ints NUM such that 0 <= NUM < NUM_SPECIALIZATIONS
switch(my_num)
{
unroll<0,NUM_SPECIALIZATIONS>(template_lambda);
}
Unfortunately this code doesn't compile. It never makes it past the syntax checker, since the "case" and "break" statements in my lambda function technically aren't inside the switch statement yet.
In order for this to work, I need to implement the "unroll" function using macros rather than templates and lambdas, so that the copying and pasting of source code happens before syntax checking rather than after.
An alternate solution which I've played around with is to mimic what the switch statement does on the low level. I can create an array of function pointers which serves as a jump table:
std::array<std::function<void()>,NUM_SPECIALIZATIONS> jump_table;
Then I can populate the jump table with pointers to the various template specializations without necessarily having to type them all out. Instead I can just use the unroll function:
template<auto begin, auto end>
inline void unroll(auto foo)
{
if constexpr(begin < end)
{
foo.template operator()<begin>();
unroll<begin + 1, end>(foo);
}
}
unroll<0,NUM_SPECIALIZATIONS>([&]<int NUM>()
{
jump_table[NUM] = print_num<NUM>;
});
There we go. Now the interpreter and the template specializations are decoupled.
Then when I want to call the corresponding template specialization to the value of a runtime variable my_num
, I can do:
jump_table[my_num](); // Almost like saying print_num<my_num>();
It's even possible to modify the jump table at runtime, simply by reassigning the contents of the array to a different function name:
jump_table[NUM] = /* a different function name */;
The downside of this approach is that, compared to a switch statement, a slight runtime penalty is incurred just from accessing the elements of the array. I think this is inherent due to the fact that switch statements generate their jump tables in instruction memory at compile time, whereas what I've done here is generate the jump table in data memory at runtime.
I suppose the slight runtime penalty doesn't matter very much as long as the functions have long enough execution times, at which point the overhead would be negligible by comparison.
print_num
in any other scenario than with values only known at runtime? If not, why not make it into a regular function (not a function template) and call it with an argument?void print_num(size_t x) { static const char* arr[] = {"Zero", "One", ... }; std::cout << arr[x] << '\n'; }
(+ added bounds checking if needed) – Motteostd::function
and its type-converting overhead to hold a simple pointer-to-function.typedef void (*fptr)(); std::array<fptr, NUM_SPECIALIZATIONS> jump_table;
. – Pinto