(C++) Automatically generate switch statement cases at compile time

Asked 18/8, 2021 at 11:53 Answered 25/8, 2021 at 8:43

c++templates metaprogramming template-meta-programming

In my program I have some code which looks something like this: A collection of functions or classes which uniquely implement a common template via template specialization:

constexpr int NUM_SPECIALIZATIONS = 32;

template<int num>
void print_num(){}

template<>
void print_num<0>(){cout << "Zero" << endl;}

template<>
void print_num<1>(){cout << "One" << endl;}

template<>
void print_num<2>(){cout << "Two" << endl;}

// etc, ...

template<>
void print_num<31>(){cout << "Thirty-One" << endl;}

Then I have a variable whose value is only known at run time:

int my_num;
cin >> my_num; // Could be 0, could be 2, could be 27, who tf knows

Then I need to call the template specialization which corresponds to the value of the variable. Since I can't use a variable as a template parameter, I would create a sort of "interpreter":

switch(my_num)
{
  case 0:
  print_num<0>();
  break;
  case 1:
  print_num<1>();
  break;
  case 2:
  print_num<2>();
  break;
  // etc, ...
  case 31:
  print_num<31>();
  break;
}

The first thing I notice about this code is that it is repetitive. Surely there must be some sort of trick to procedurally generate this code.

The other thing I notice is that it is inconvenient to maintain, since it is coupled with the template specializations; Each time I want to add a new template specialization, I need to update the interpreter as well.

Ideally, I would be able to use some sort of template magic to generate the interpreter automatically at compile time so that both sections of code can remain decoupled while I maintain the efficiency of the switch statement:

// Copies and pastes the code found in template lambda "foo",
// Replacing all occurrences of its template parameter with values from
// "begin" until "end"
template<auto begin, auto end>
inline void unroll(auto foo)
{
  if constexpr(begin < end)
  {
    foo.template operator()<begin>();
    unroll<begin + 1, end>(foo);
  }
}

// A template lambda which generates a generic switch case for the interpreter
auto template_lambda = [&]<int NUM>()
{
  case NUM:
  print_num<NUM>();
  break;
};

// The interpreter; contains the code "case NUM: print_num<NUM>(); break;"
// repeated for all ints NUM such that 0 <= NUM < NUM_SPECIALIZATIONS
switch(my_num)
{
  unroll<0,NUM_SPECIALIZATIONS>(template_lambda);
}

Unfortunately this code doesn't compile. It never makes it past the syntax checker, since the "case" and "break" statements in my lambda function technically aren't inside the switch statement yet.

In order for this to work, I need to implement the "unroll" function using macros rather than templates and lambdas, so that the copying and pasting of source code happens before syntax checking rather than after.

An alternate solution which I've played around with is to mimic what the switch statement does on the low level. I can create an array of function pointers which serves as a jump table:

std::array<std::function<void()>,NUM_SPECIALIZATIONS> jump_table;

Then I can populate the jump table with pointers to the various template specializations without necessarily having to type them all out. Instead I can just use the unroll function:

template<auto begin, auto end>
inline void unroll(auto foo)
{
  if constexpr(begin < end)
  {
    foo.template operator()<begin>();
    unroll<begin + 1, end>(foo);
  }
}

unroll<0,NUM_SPECIALIZATIONS>([&]<int NUM>()
{
  jump_table[NUM] = print_num<NUM>;
});

There we go. Now the interpreter and the template specializations are decoupled.

Then when I want to call the corresponding template specialization to the value of a runtime variable my_num, I can do:

jump_table[my_num](); // Almost like saying print_num<my_num>();

It's even possible to modify the jump table at runtime, simply by reassigning the contents of the array to a different function name:

jump_table[NUM] = /* a different function name */;

The downside of this approach is that, compared to a switch statement, a slight runtime penalty is incurred just from accessing the elements of the array. I think this is inherent due to the fact that switch statements generate their jump tables in instruction memory at compile time, whereas what I've done here is generate the jump table in data memory at runtime.

I suppose the slight runtime penalty doesn't matter very much as long as the functions have long enough execution times, at which point the overhead would be negligible by comparison.

Eluvium answered 18/8, 2021 at 11:53 Comment(4)

Are you using print_num in any other scenario than with values only known at runtime? If not, why not make it into a regular function (not a function template) and call it with an argument? void print_num(size_t x) { static const char* arr[] = {"Zero", "One", ... }; std::cout << arr[x] << '\n'; } (+ added bounds checking if needed) – Motteo 18/8, 2021 at 11:58

this is somewhat related #2157662 and more details on the same here #2157649. I don't think there is a way around macros unless you want to change the approach as Ted suggested – Quar 18/8, 2021 at 12:21

Yes, an array of function pointers is probably a good approach. You don't need std::function and its type-converting overhead to hold a simple pointer-to-function. typedef void (*fptr)(); std::array<fptr, NUM_SPECIALIZATIONS> jump_table;. – Pinto 18/8, 2021 at 12:45

@Ted Lyngmo This is because in my project, each specialization is VERY unique. The example I've shown is simplified, but in my project the behavior of each function and the data it has access to is going to be wildly different and therefore non-generalizable. – Eluvium 18/8, 2021 at 16:59

You could do it like this:

namespace detail {
    template<size_t I>
    void run(size_t idx) {
        if (I == idx) {
            print_num<I>();
        } 
        else if constexpr (I+1 < NUM_SPECIALIZATIONS) {
            run<I+1>(idx);
        }
    }
}

void run(size_t idx) {
    detail::run<0>(idx);
}

Then call it like so:

int my_num;
cin >> my_num;
if (my_num >= 0)
   run(my_num);

Compilation times might suffer, however, due to possibly deep recursions, depending on the number of specializations.

Handicraft answered 18/8, 2021 at 13:3 Comment(8)

Would the compiler be guaranteed to simplify that chain of if-statements into a single switch statement? – Eluvium 18/8, 2021 at 17:12

@LoganSchlick No. It'll be a recursive call and some optimization may remove the recursion. – Motteo 18/8, 2021 at 17:17

Actually I looked at the assembly instructions and it does compile into a big chain of if statements. Even with compiler optimizations on. Meaning this solution has O(n) complexity with the number of cases. Also you can't place else before the if statement because otherwise the synax checker will get angry. – Eluvium 18/8, 2021 at 17:28

@LoganSchlick I think it's because of the optimization. Different compilers may do it differently though. What do you mean about the syntax checker? It looks ok here – Motteo 18/8, 2021 at 17:31

Good point about the else. I've edited my answer :-) – Banbury 18/8, 2021 at 17:36

@Ted Lyngmo Ah I misread your comment and I thought you meant to put the else before the first if. My bad. But I'm looking at the assembly code and it still looks like a big chain of if statements. Is there any compiler optimization which would convert that into a switch statement? (I've only tested it with no optimizations and -O3 optimizations) – Eluvium 18/8, 2021 at 17:41

@LoganSchlick "it still looks like a big chain of if statements" - That may be. I'd expect a switch to look very similar. You could of course make an array of function pointers like this if you want something that looks very different - but I'd measure the performance before selecting one over the other. – Motteo 18/8, 2021 at 19:12

@LoganSchlick Another take on Matthias' nice solution would be to make it into a binary search like this. The average number of comparisons (< 1.5 per branch) could then be reduced a lot compared to having to search sequentially from start until the number is found if you have a lot of specializations, – Motteo 18/8, 2021 at 20:52

In addition to another answer, let me post a Boost.Mp11-based solution, which is a one-liner:

std::size_t my_num;
std::cin >> my_num;

boost::mp11::mp_with_index<NUM_SPECIALIZATIONS>(
    my_num, [](auto index) { print_num<index>(); });

Here index has type std::integral_constant<std::size_t, i>, which is implicitly convertible into std::size_t. The conversion operator is constexpr.

Violate answered 18/8, 2021 at 13:35 Comment(0)

Non-recursive version

template <class T, class F, T... I>
bool to_const(T value, F&& fn, std::integer_sequence<T, I...>) {
    return (
        (value == I && (fn(std::integral_constant<T, I>{}), true))
        || ... // or continue
    );
}

template <std::size_t Size, class T, class F>
bool to_const(T value, F&& fn) {
    return to_const(value, fn, std::make_integer_sequence<T, Size>{});
}

And the usage

int my_num;
cin >> my_num;

bool found = to_const<NUM_SPECIALIZATIONS>(my_num, [](auto I)
{
    print_num<I>();
});

GCC 11 is able to optimize away comparison in to_const() completely, and build a jump table using input index to select right method.

jmp     [QWORD PTR .L4[0+rax*8]]

Where rax is my_num and .L4 is jump table.

See result here

The same optimization, but already from GCC version 5, can be achieved by straight forward implementation of jump table.

int my_num;
cin >> my_num;

auto print_jump_table = []<int... I>(std::integer_sequence<int, I...>) {
    static constexpr void(*jump_table[])() = { print_num<I>... };
    return jump_table;
}(std::make_integer_sequence<int, NUM_SPECIALIZATIONS>{});

if (unsigned(my_num) < NUM_SPECIALIZATIONS) {
    print_jump_table[my_num]();
}

See result here

Thorndike answered 25/8, 2021 at 8:43 Comment(0)

Recommended topics

Hot tags