Someone better versed in this stuff can improve my answer.
The gist of my solution is that you declare N dimensions, with a start and an end.
It recurses on N-1 dimensions with the same start and end.
When it reaches the 1st dimension, it will actually begin incrementing the start, calling the passed function.
It will always attempt to pass a number of arguments identical to the number of dimensions (their indices).
So a call like this:
meta_for<2, 0, 2>::loop(
[](size_t i, size_t j)
{
std::cout << i << " " << j << std::endl;
});
Will result in output like this:
0 0
0 1
1 0
1 1
Here's the meta_for
structure, which uses a helper, iterate
:
template<size_t D, size_t B, size_t E>
struct meta_for
{
template<typename Func>
static void loop(Func&& func)
{
iterate<D, B, B, E>::apply(std::forward<Func>(func));
}
};
And the helpers:
// a helper macro to avoid repeating myself too much
#define FN template<typename Func, typename... Args> \
static void apply(Func&& func, Args&&... a)
// Outer loop. S="Self" or "Start". Indicating current index of outer loop. Intent is to iterate until S == E
template<int Dim, size_t S, size_t B, size_t E>
struct iterate
{
static_assert(S < E && B < E, "Indices are wrong");
FN
{
// outer loop recursive case. Recurse on lower Dimension (Dim-1), and then increment outer loop (S+1)
iterate<Dim-1, B, B, E>::apply (func, a..., S);
iterate<Dim, S+1, B, E>::apply (func, a...);
}
};
// Outer loop base case
template<int Dim, size_t B, size_t E>
struct iterate<Dim, E, B, E>
{
FN
{
// outer loop base case, End == End. Terminate loop
}
};
// innter loop. "S" is outer loop's current index, which we need to pass on to function
// "B" is inner loop's (this loop) current index, which needs to iterate until B == E
template<size_t S, size_t B, size_t E>
struct iterate<1, S, B, E>
{
static_assert(S < E && B < E, "Indices are wrong");
FN
{
// inner loop recursive case. Perform work, and then recurse on next index (B+1)
func(a..., B);
iterate<1, S, B+1, E>::apply(func, a...);
}
};
// inner loop base case
template<size_t S, size_t E>
struct iterate<1, S, E, E>
{
FN
{
// inner loop base case, End == End. Terminate loop
}
};
// case where zero dimensions (no loop)
template<size_t S, size_t B, size_t E>
struct iterate<0, S, B, E>
{
static_assert(sizeof(S) == 0, "Need more than 0 dimensions!");
};
More explanation
This solution, like any other involving variadic templates, relies on recursion.
I wanted to express recursion on an outer loop, so I started with a base case; the end of the loop. This is the case where the start is the same as the end :
template<int Dim, size_t B, size_t E>
struct iterate<Dim, E, B, E>
{ /*..*/};
Notice here that this is a specialization for <Dim, E, B, E>
. The second position indicates the outer loop's current index, and the last position indicates the index to iterate up to (but not including). So in this case, the current index is the same as the last, indicating we are finished looping (and hence a "do nothing" function).
The recursive case for the outer loop involves the scenario where the loop index is less than the index to iterate to. In template terms, the second position is less than the fourth position:
template<int Dim, size_t S, size_t B, size_t E>
struct iterate
{/*...*/}
Notice this is NOT a specialization.
The logic of this function is that an outer loop should signal an inner loop to begin executing from its start, and then the outer loop continues and starts the process all over again for inner loops:
iterate<Dim-1, B, B, E>::apply (func, a..., S);
iterate<Dim, S+1, B, E>::apply (func, a...);
Notice in the first line that the second template argument is again B
, indicating to start at the beginning again. This is necessary because the other recursive case on the second line increments S
(incrementing outer loop index).
The entire time, we are also accumulating arguments to pass to the function:
::apply(func, a..., S)
is passing the function on, along with indices of higher dimension loops, and then appending the current loop's index (S
). a
here is a variadic template.
The inner loop
When I say "inner loop", I mean the innermost loop. This loop needs to simply increment until the start index reaches the end index, and not attempt to recurse on any lower dimension. In our case, this is when our Dim
(Dimension) parameter is 1:
template<size_t S, size_t B, size_t E>
struct iterate<1, S, B, E>
{/*...*/};
At this point, we finally want to call our passed function, along with all arguments we've accumulated thus far (the indices of the outer loops) PLUS, the index of the innermost loop:
func(a..., B);
And then recurse (increment index)
iterate<1, S, B+1, E>::apply(func, a...);
The base case here is when the innermost loop's index is the same as the end index (AND the dimension is 1):
template<size_t S, size_t E>
struct iterate<1, S, E, E>
{/*...*/};
Hence the "do nothing" function here; there shouldn't be any work performed because the loop is terminating.
Finally, I included one last specialization to catch a user error where they didn't specify any dimensions:
template<size_t S, size_t B, size_t E>
struct iterate<0, S, B, E>
Which uses static_assert
to always fail because sizeof(size_t)
is not zero:
static_assert(sizeof(S) == 0, "Need more than 0 dimensions!");
Conclusion
This is a specific use case template meta-program. Where we essentially generate N nested for loops that all have the same start and end indices AND we want to pass those indices to a function. We could do a little more work to make it such that the iterate
structure could stand on its own without making the assumption that the outer loop's start and end indices are the same as an inner loop's.
My favorite application of this code is that we can use it to make an N-dimensional counter. For example, a binary counter for N-bits (found in the live demo).
std::forward<Lambda>
overstatic_cast<Lambda&&>
– Bose0..10*10*10
and recompute with modulo each part. See Completely enumerate indices of D-dimensional array at compile time – BoseN
nested loops, I just implement cartesian product of an arbitrary number of type lists, and then have a single loop over the cartesian product result. For run-time code, this will require a lot more memory, and it's not a great idea. But for compile-time code, I think it's not asymptotically different since at compile-time, you need to instantiate a new type / function / something for each iteration of any looping computation anyways. And here, you are outputting a structure of that size anyways. – Armenian