I often use compiler-based vectorization, e.g., for AVX. I am trying to come up with a cleaner way to do this that relies on C++11 alignment features instead of compiler-specific extensions (such as Intel's #pragma vector aligned). In the code below, for example, aligned::array<double,48> my_array; lets me declare an array on the stack with proper alignment, and if it is used in the same translation unit, compilers seem to recognize this.
My question now concerns how to declare a function with aligned parameters. My most successful attempt is, e.g., aligned::ptr<double>, as used in the function f() below.
gcc compiles this without warnings (use -std=c++0x -O3), and the loop is vectorized. Intel's icc, however, gives a warning and does not vectorize properly (warning #3463: alignas does not apply here; using type alignas(64) = T;).
Who is correct? Is there something wrong with my usage of alignas? Is there a better way to accomplish this?
namespace aligned {
  template <class T, int N>
  using array alignas(64) = T[N];

  template <class T>
  using type alignas(64) = T;

  template <class T>
  using ptr = type<T> *;
}

#ifdef __ICC
#define IVDEP "ivdep"
#else
#define IVDEP "GCC ivdep"
#endif

void f(aligned::ptr<double> x, const aligned::ptr<double> y) {
  _Pragma(IVDEP)
  for(int i=0; i<4; i++)
    x[i] = x[i]*y[i];
}
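
For comparison, here is a minimal sketch of one alternative to the alias-based approach. It assumes that putting alignas on a class type (which is standard C++11) is acceptable for storage, and that an alignment hint on a plain pointer is enough for the function parameters. The names aligned_array, assume_aligned64 and f_hinted are hypothetical and exist only for this illustration. __builtin_assume_aligned is a GCC/Clang builtin rather than standard C++, so this does not fully avoid compiler extensions, and I have not verified how icc treats it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: the alignment is attached to a class type,
// not to an alias, so alignas appears in a position C++11 allows.
template <class T, std::size_t N>
struct alignas(64) aligned_array {
    T data[N];
    T& operator[](std::size_t i) { return data[i]; }
    const T& operator[](std::size_t i) const { return data[i]; }
};

static_assert(alignof(aligned_array<double, 48>) == 64,
              "wrapper type should carry 64-byte alignment");

// Hypothetical helper: promise the optimizer that p is 64-byte aligned.
// Falls back to returning p unchanged where the builtin is unavailable.
template <class T>
inline T* assume_aligned64(T* p) {
#if defined(__GNUC__) || defined(__clang__)
    return static_cast<T*>(__builtin_assume_aligned(p, 64));
#else
    return p;  // still correct, just without the hint
#endif
}

void f_hinted(double* x_raw, const double* y_raw) {
    // Check the promise in debug builds; breaking it is undefined behavior.
    assert(reinterpret_cast<std::uintptr_t>(x_raw) % 64 == 0);
    assert(reinterpret_cast<std::uintptr_t>(y_raw) % 64 == 0);

    double* x = assume_aligned64(x_raw);
    const double* y = assume_aligned64(y_raw);
    for (int i = 0; i < 4; i++)
        x[i] = x[i] * y[i];
}
```

With this sketch, a caller would declare aligned_array<double,48> a, b; and call f_hinted(a.data, b.data);. The alignment requirement then lives in the function body rather than in the parameter type, which is the main trade-off compared with aligned::ptr<double>.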
array = alignas(64) T[N]? – Substitute