How to unroll a for loop using template metaprogramming

How do you write simple C++ code that runs a for loop with a given unrolling factor? For instance, I need a for loop that assigns the value i to each index of an array, i.e., A[i] = i, for an array size of, let's say, 1e6.

Now I want to add an unrolling factor of, let's say, 20. I do not want to manually write 20 lines of code and iterate 50k times (1e6 / 20). How do I do this? Do I nest my for loop? Does the compiler automatically do some unrolling for me if I use template metaprogramming? And how do I manually set an unrolling factor (fixed at compile time, of course)?

Polyglot answered 22/10, 2017 at 6:4 Comment(4)
If you don't trust your compiler to do the best unrolling possible, perhaps find a better compiler. - Grow
To reiterate n.m’s answer, don’t waste your time pretending to be a compiler. Your compiler knows how to best optimize your loops, including unrolling stuff. Save efforts to fix stuff for bottlenecks found while profiling. - Touchy
You may find this useful: informit.com/articles/article.aspx?p=30667&seqNum=7 - Mcwhirter
In the past, the very same argument was made in other areas where you should "trust the compiler", such as compile-time evaluation. In a certain theoretical sense, 'constexpr' has never been needed. But people noticed the compiler basically not doing its job, and wanted verification. And rightfully so: the introduction of constexpr drove big changes necessary for the optimizer to finally do what people had been imagining. The reality of loop unrolling is in a similar state. It is not hard to trip up the loop unrolling on any of the major compilers. Of course you should actually measure, though. - Marvamarve

The following examples are written in C++17, but with some more verbose techniques the idea is applicable to C++11 and above.

If you really want to force some unrolling then consider std::make_integer_sequence and C++17's fold expressions:

#include <iostream>
#include <type_traits>
#include <utility>

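namespace detail {

// Receives the indices 0..count-1 as the template parameter pack `inds`.
template<class T, T... inds, class F>
constexpr void loop(std::integer_sequence<T, inds...>, F&& f) {
  // C++17 fold expression over the comma operator: expands to
  // f(integral_constant<T, 0>{}), f(integral_constant<T, 1>{}), ...
  (f(std::integral_constant<T, inds>{}), ...);
}

}  // namespace detail

// Calls f once for each index in [0, count); the calls are generated at compile time.
template<class T, T count, class F>
constexpr void loop(F&& f) {
  detail::loop(std::make_integer_sequence<T, count>{}, std::forward<F>(f));
}

int main() {
  loop<int, 5>([] (auto i) {
    // i is a std::integral_constant, so its value is usable in a constant expression
    constexpr int it_is_even_constexpr = i;
    std::cout << it_is_even_constexpr << std::endl;
  });
}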
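
To connect this back to the question (A[i] = i with an unrolling factor of 20), the same fold-expression idea can be wrapped in a runtime outer loop. The following is only a sketch: the names unrolled_body, unrolled_for and fill_iota are hypothetical helpers made up for illustration, and whether this beats the compiler's own unrolling is something you would have to measure.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper: expands to f(0), f(1), ..., f(Factor-1) with no inner loop.
template<class F, std::size_t... Is>
constexpr void unrolled_body(F&& f, std::index_sequence<Is...>) {
  (f(std::integral_constant<std::size_t, Is>{}), ...);  // C++17 fold expression
}

// Hypothetical helper: calls body(i) for every i in [0, n),
// issuing Factor calls per iteration of the outer runtime loop.
template<std::size_t Factor, class Body>
void unrolled_for(std::size_t n, Body&& body) {
  static_assert(Factor > 0, "unrolling factor must be positive");
  std::size_t i = 0;
  for (; i + Factor <= n; i += Factor) {
    unrolled_body([&](auto k) { body(i + decltype(k)::value); },
                  std::make_index_sequence<Factor>{});
  }
  for (; i < n; ++i) {
    body(i);  // remainder when n is not a multiple of Factor
  }
}

// The example from the question: A[i] = i, unrolled by a factor of 20.
std::vector<int> fill_iota(std::size_t n) {
  std::vector<int> a(n);
  unrolled_for<20>(n, [&](std::size_t i) { a[i] = static_cast<int>(i); });
  assert(a.empty() || a.back() == static_cast<int>(n - 1));  // sanity check
  return a;
}
```

Calling fill_iota(1000000) runs the outer loop 50,000 times with 20 assignments per pass. The second loop in unrolled_for only matters when the size is not a multiple of the factor.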
Gharry answered 22/10, 2017 at 11:57 Comment(1)
How does the fold expression work here? How does std::integral_constant<T, inds>{} become a different index on each iteration? - Lizethlizette
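
On the comment's question: there is no iteration at run time. std::make_integer_sequence<int, 3> is the type std::integer_sequence<int, 0, 1, 2>, so the pack inds is deduced as 0, 1, 2, and the fold over the comma operator stamps out one call per element, each with its own std::integral_constant type. A minimal sketch of that equivalence for a count of 3 follows; loop_impl, loop3_by_hand and the two collect functions are names invented here for illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Same shape as detail::loop in the answer.
template<class T, T... inds, class F>
constexpr void loop_impl(std::integer_sequence<T, inds...>, F&& f) {
  (f(std::integral_constant<T, inds>{}), ...);
}

// What the fold expression expands to when inds is 0, 1, 2.
template<class F>
constexpr void loop3_by_hand(F&& f) {
  f(std::integral_constant<int, 0>{}),
  f(std::integral_constant<int, 1>{}),
  f(std::integral_constant<int, 2>{});
}

std::vector<int> collect_with_fold() {
  std::vector<int> out;
  loop_impl(std::make_integer_sequence<int, 3>{}, [&](auto i) {
    // Each call sees a different type, so the index is a compile-time constant.
    static_assert(decltype(i)::value >= 0, "index is known at compile time");
    out.push_back(decltype(i)::value);
  });
  return out;
}

std::vector<int> collect_by_hand() {
  std::vector<int> out;
  loop3_by_hand([&](auto i) { out.push_back(decltype(i)::value); });
  return out;
}
```

Both functions visit the indices in the same left-to-right order, because the comma operator sequences its operands.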
