Wrapper for `__m256` Producing Segmentation Fault with Constructor - Windows 64 + MinGW + AVX Issues

I have a union that looks like this:

#include <immintrin.h>

union bareVec8f {
    __m256 m256; // AVX vector of 8 floats
    float floats[8];
    int ints[8];
    inline bareVec8f() {
    }
    inline bareVec8f(__m256 vec) {
        this->m256 = vec;
    }
    inline bareVec8f &operator=(__m256 m256) {
        this->m256 = m256;
        return *this;
    }

    inline operator __m256 &() {
        return m256;
    }
};

The __m256 member needs to be aligned on a 32-byte boundary to be used with AVX functions, and it should be aligned automatically, even within the union.

And when I do this

bareVec8f test = _mm256_set1_ps(1.0f);

I get a segmentation fault. This code should work because of the constructor I made. However, when I do this:

bareVec8f test;
test.m256 = _mm256_set1_ps(8.f);

I do not get a segmentation fault.

Since that works fine, the union is probably aligned properly; the segmentation fault seems to be caused by the constructor call itself.

I'm using the 64-bit GCC compiler on Windows (MinGW).

EDIT: Matt managed to produce the simplest example of the error that seems to be happening here:

#include <immintrin.h>

void foo(__m256 x) {}

int main()
{
    __m256 r = _mm256_set1_ps(0.0f);
    foo(r);
}

I'm compiling with -std=c++11 -mavx

Washstand answered 18/6, 2015 at 21:38 Comment(9)
You can test the alignment hypothesis quite easily: auto ptrvalue = (std::uintptr_t)this;.Profanity
You didn't make a copy-constructor. bareVec8f(__m256 vec) is a constructor. The copy-constructor has signature bareVec8f(bareVec8f &), possibly with const. bareVec8f test = _mm256_set1_ps(1.0f); uses the compiler-generated copy constructor (although it would probably be elided in this case).Antlia
@MattMcNabb I left out 2 methods of the union, which I included now in the post, is the error recreatable now?.Washstand
@MattMcNabb my mcve good enough?Washstand
@MattMcNabb yes errors on test and test2Washstand
I reproduce the behaviour by changing avxintrin.h to immintrin.h and adding a semicolon after the union definition (using mingw-w64 4.9.2), and compiling with g++ -o foo foo.cc -mavx (error goes away for some other compiler switches)Antlia
to make the example I had to replace avxintrin.h with immintrin.h, I don't know why that's necessary but doesn't matter. I edited the example to something that compiles.Washstand
and the example I just made I replaced with Matt's more simple exampleWashstand
Related - stackoverflow.com/questions/30928265.Kesley

This is a bug in g++ for Windows: it does not perform 32-byte stack alignment when it should. See GCC Bug 49001 and Bug 54412.

On this SO thread someone made a Python script to process the assembly output by g++ to fix the problem, so that would be one option.

Otherwise, to avoid this in your union, you could change the functions that take __m256 by value so that they take it by reference instead. This shouldn't carry any performance penalty unless optimization is low or off.

In case you are unaware: union aliasing causes undefined behaviour in C++. It is not permitted to write m256 and then read floats or ints, for example. So perhaps there is a different solution to your problem.
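One way to read the lanes without going through a union is to copy the bytes with `std::memcpy`. The sketch below uses hypothetical helper names (`to_floats`, `to_ints`, `from_floats`) and passes every `__m256` by reference; intrinsics such as `_mm256_storeu_ps` are another common route.

```cpp
#include <immintrin.h>
#include <array>
#include <cstdint>
#include <cstring>

static_assert(sizeof(__m256) == 8 * sizeof(float), "expected 8 packed floats");

// Copy the eight lanes out as floats.
inline std::array<float, 8> to_floats(const __m256 &v) {
    std::array<float, 8> out;
    std::memcpy(out.data(), &v, sizeof(v));
    return out;
}

// Copy the same 32 bytes out as raw 32-bit integers (bit patterns, not
// numeric conversions).
inline std::array<std::int32_t, 8> to_ints(const __m256 &v) {
    std::array<std::int32_t, 8> out;
    std::memcpy(out.data(), &v, sizeof(v));
    return out;
}

// Fill a vector from eight floats, written through a reference so that
// no __m256 is passed or returned by value.
inline void from_floats(const std::array<float, 8> &in, __m256 &out) {
    std::memcpy(&out, in.data(), sizeof(out));
}
```

Compilers typically turn these fixed-size copies into plain vector loads and stores when optimization is enabled, so the wrapper can stay a simple struct holding one `__m256`.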

Antlia answered 18/6, 2015 at 23:50 Comment(4)
Yes confirmed, it produces seg faultWashstand
MSVC does not pass SSE and AVX registers by value (but GCC, ICC, and CLANG do). So I always pass them as const references.Jago
Has anyone solved this since? If I just apply AVX functions (intrinsics) to a properly aligned array, will I have issues?Kesley
@Royi: If you compile with optimization enabled so __m256 local vars aren't spilled/reloaded to the stack, then you will probably be ok. But you won't reliably be able to compile with -O0.Obbligato
