Why does vectorization behave differently for almost the same code? - McMap

Asked 8/5, 2015 at 21:21 Answered 2/3, 2018 at 16:15

c++ visual-studio vectorization sse auto-vectorization

Here are three free functions that perform the same computation. The loop in the first one is not vectorized, while the loops in the other two are. Why is that?

#include <vector>

typedef std::vector<double> Vec;

void update(Vec& a, const Vec& b, double gamma) {
    const size_t K = a.size();
    for (size_t i = 0; i < K; ++i) { // not vectorized
        a[i] = b[i] * gamma - a[i];
    }
}

void update2(Vec& a, const Vec& b, double gamma) {
    for (size_t i = 0; i < a.size(); ++i) { // vectorized
        a[i] = b[i] * gamma - a[i];
    }
}

void update3(Vec& a, size_t K, const Vec& b, double gamma) {
    for (size_t i = 0; i < K; ++i) { // vectorized
        a[i] = b[i] * gamma - a[i];
    }
}

int main(int argc, const char* argv[]) {
    Vec a(argc), b(argc); // b must be at least as long as a, otherwise b[i] reads out of bounds
    update(a, b, 0.5);
    update2(a, b, 0.5);
    update3(a, a.size(), b, 0.5);
    return 0;
}

Relevant messages from the compiler (VS2013):

1>  c:\home\dima\trws\trw_s-v1.3\trws\test\vector.cpp(7) : info C5002: loop not vectorized due to reason '1200'
1>  c:\home\dima\trws\trw_s-v1.3\trws\test\vector.cpp(13) : info C5001: loop vectorized
1>  c:\home\dima\trws\trw_s-v1.3\trws\test\vector.cpp(19) : info C5001: loop vectorized

From comment by @tony

Reason 1200: "Loop contains loop-carried data dependences that prevent vectorization. Different iterations of the loop interfere with each other such that vectorizing the loop would produce wrong answers, and the auto-vectorizer cannot prove to itself that there are no such data dependences." source
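To make the quoted wording concrete, here is a minimal sketch contrasting a loop that has a genuine loop-carried dependence with one whose iterations are independent. The function names `prefix_sum` and `scale` are hypothetical and not part of the code above. Whether a particular compiler vectorizes either loop still depends on the compiler and its flags.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: each iteration reads the value written by the
// previous one, so the iterations cannot simply run side by side in SIMD lanes.
void prefix_sum(std::vector<double>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        a[i] += a[i - 1]; // loop-carried dependence on a[i - 1]
    }
}

// Hypothetical example: iteration i touches only a[i], so the iterations
// are independent and the loop is a natural candidate for vectorization.
void scale(std::vector<double>& a, double gamma) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] *= gamma; // no dependence between iterations
    }
}
```

The loops in update, update2 and update3 look like the second case, unless a and b refer to overlapping storage, which is what the compiler has to rule out.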

Nikaniki asked 8/5, 2015 at 21:21 Comment(9)

Try a different compiler? Others (gcc and clang) do vectorize all 3 functions. – Isiah 8/5, 2015 at 21:35

What is "reason 1200" documented as? – Psychosomatic 8/5, 2015 at 22:05

Reason 1200: "Loop contains loop-carried data dependences that prevent vectorization. Different iterations of the loop interfere with each other such that vectorizing the loop would produce wrong answers, and the auto-vectorizer cannot prove to itself that there are no such data dependences." source – Roanna 8/5, 2015 at 22:16

I'm surprised that any compiler is able to vectorize any of those at all. The compiler needs to prove that either a and b do not alias or that &a[0] < &b[0]. – Stouthearted 9/5, 2015 at 7:52

@Stouthearted Or much more easily, the compiler can generate both the sequential and vectorized versions, and have a runtime test checking if a and b overlap to dispatch to either version. – Isiah 10/5, 2015 at 8:11

@Mysticial, according to sum-of-overlapping-arrays-auto-vectorization-and-restrict, GCC builds two versions of the loop, one that handles overlap and one that assumes no overlap (which it can vectorize), and checks for overlap at run time before choosing between them. With restrict, GCC emits only the no-overlap version and drops the check. – Ethology 11/5, 2015 at 7:28
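The runtime dispatch described in the two comments above can be written out by hand. This is only a sketch of the idea, not what any compiler literally emits: the names `disjoint` and `update_versioned` are hypothetical, and `__restrict` is a compiler extension (accepted by MSVC, GCC and Clang) rather than standard C++.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true when [a, a + n) and [b, b + n) share no element.
// Addresses are compared as integers because relational comparison of
// pointers into different arrays is unspecified in C++.
bool disjoint(const double* a, const double* b, std::size_t n) {
    const std::uintptr_t pa = reinterpret_cast<std::uintptr_t>(a);
    const std::uintptr_t pb = reinterpret_cast<std::uintptr_t>(b);
    const std::uintptr_t bytes = n * sizeof(double);
    return pa + bytes <= pb || pb + bytes <= pa;
}

void update_versioned(double* a, const double* b, std::size_t n, double gamma) {
    if (disjoint(a, b, n)) {
        // No-overlap version: the check above justifies the __restrict
        // promise, so the compiler is free to vectorize this loop.
        double* __restrict ra = a;
        const double* __restrict rb = b;
        for (std::size_t i = 0; i < n; ++i) {
            ra[i] = rb[i] * gamma - ra[i];
        }
    } else {
        // Overlap version: plain pointers, so the compiler must preserve
        // the in-order semantics of the original loop.
        for (std::size_t i = 0; i < n; ++i) {
            a[i] = b[i] * gamma - a[i];
        }
    }
}
```

The check costs a few integer operations per call, which is usually negligible next to the loop itself unless n is very small.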

All three loops seem sub-optimal to me because the compiler has to assume that a and b may alias. For best performance, access the data elements through restrict-qualified pointers instead. – Holbrook 13/8, 2015 at 13:36

@Holbrook C++ does not have restrict – Strasser 13/8, 2015 at 14:05

@Strasser But MSVC has the __restrict keyword, although ideally a C99 implementation with a C++ wrapper would be best. – Holbrook 13/8, 2015 at 14:23
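A minimal sketch of the restrict-qualified approach suggested in this exchange, assuming the `__restrict` extension (available in MSVC, GCC and Clang, but not standard C++). The names `update_restrict` and `update4` are hypothetical. The qualifier is a promise made by the caller: if a and b do overlap, the behavior is undefined.

```cpp
#include <cstddef>
#include <vector>

// Sketch: the caller promises that the n elements at a and the n elements
// at b do not overlap, so the compiler needs no aliasing check.
void update_restrict(double* __restrict a, const double* __restrict b,
                     std::size_t n, double gamma) {
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = b[i] * gamma - a[i];
    }
}

// Hypothetical wrapper that keeps the std::vector interface of the question.
// Assumed preconditions: b.size() >= a.size(), and a and b are distinct vectors.
void update4(std::vector<double>& a, const std::vector<double>& b, double gamma) {
    update_restrict(a.data(), b.data(), a.size(), gamma);
}
```

Two distinct std::vector objects never share storage, so the wrapper can honor the promise as long as it is not called with the same vector for both arguments.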

My guess is that this is a deeply internal compiler implementation issue, such as the stage at which the auto-vectorizer kicks in and the state of the code's internal representation at that point. When I tried MSVC 2017, the results were more in line with what one would expect: it auto-vectorized update() and update3() but not update2(), giving reason 501 for line 14, which is documented as:

Induction variable is not local; or upper bound is not loop-invariant.

Surf answered 2/3, 2018 at 16:15 Comment(0)
