How to convert 'long long' (or __int64) to __m64

What is the proper way to convert an __int64 value to an __m64 value for use with SSE?

Schellens asked 30/1, 2012 at 8:46 Comment(1)
For the googlers, can someone explain __int64 vs __m64? :-) - Latialatices

With gcc you can just use _mm_set_pi64x:

#include <mmintrin.h>

long long i = 0x123456LL;  // gcc on Linux has no __int64; use __int64 with MSVC or MinGW
__m64 v = _mm_set_pi64x(i);

Note that not all compilers have _mm_set_pi64x defined in mmintrin.h. For gcc it's defined like this:

extern __inline __m64  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_set_pi64x (long long __i)
{
  return (__m64) __i;
}

which suggests that you could probably just use a cast if you prefer, e.g.

long long i = 0x123456LL;  // gcc on Linux has no __int64; use __int64 with MSVC or MinGW
__m64 v = (__m64)i;
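
If your compiler lacks _mm_set_pi64x but you're building for x86-64, it may be worth checking whether mmintrin.h has _mm_cvtsi64_m64 and its inverse _mm_cvtm64_si64 instead; as far as I know both gcc and clang provide them for 64-bit targets. A minimal sketch, assuming an x86-64 build (roundtrip_m64 is just an illustrative name, not something from a header):

```c
#include <mmintrin.h>

/* Illustrative helper: push a 64-bit integer through __m64 and back.
   Assumes an x86-64 target, where these two intrinsics are available. */
static long long roundtrip_m64(long long i)
{
    __m64 v = _mm_cvtsi64_m64(i);        /* long long -> __m64 */
    long long back = _mm_cvtm64_si64(v); /* __m64 -> long long */
    _mm_empty();                         /* clear MMX state before any x87 code runs */
    return back;
}
```

The value that comes back should be bit-for-bit the value that went in, which makes this a quick sanity check for whichever conversion you end up using.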

Failing that, if you're stuck with an overly picky compiler such as Visual C/C++, as a last resort you can just use a union and implement your own intrinsic:

#ifdef _MSC_VER // if Visual C/C++
__inline __m64 _mm_set_pi64x (const __int64 i) {
    union {
        __int64 i;
        __m64 v;
    } u;

    u.i = i;
    return u.v;
}
#endif

Note that strictly speaking this is undefined behaviour in C++, since we are writing to one member of a union and reading from another (C99 and later allow this kind of type punning), but it should work in this instance.
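
If the union makes you nervous, another option is to copy the bits with memcpy, which is well-defined in both C and C++. A minimal sketch, assuming __m64 and long long are both 8 bytes wide (the typedef below refuses to compile otherwise); the helper names m64_from_i64 and i64_from_m64 are made up for illustration and don't come from any header:

```c
#include <mmintrin.h>
#include <string.h>

/* Compile-time size check: the array size becomes -1 (an error)
   if the two types are not the same width. */
typedef char m64_size_check[sizeof(__m64) == sizeof(long long) ? 1 : -1];

/* Hypothetical helper: copy the 64 bits of an integer into an __m64
   without a cast or a union. */
static __m64 m64_from_i64(long long i)
{
    __m64 v;
    memcpy(&v, &i, sizeof v);
    return v;
}

/* Hypothetical inverse: read the 64 bits back out as an integer. */
static long long i64_from_m64(__m64 v)
{
    long long i;
    memcpy(&i, &v, sizeof i);
    return i;
}
```

A decent optimiser will normally turn the fixed-size memcpy into a plain register move, so this shouldn't cost anything over the union version, but check the generated code if it matters.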

Yttria answered 30/1, 2012 at 9:5 Comment(13)
Where did you get this version of mmintrin.h from? What compiler are you using? For current versions of gcc (4.x) _mm_set_pi64x is defined in mmintrin.h. - Yttria
I'm using Visual Studio 2010... I got it from C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\mmintrin.h. Kinda confused... - Schellens
You should probably tag your question appropriately if you're using a non-standard compiler. See updated answer above for alternative suggestion. - Yttria
"Nonstandard" is in the eye of the beholder. It's pretty standard for Windows, which is also a pretty standard development environment... and it follows the C++ standard well enough for me (GCC isn't fantastic either). :) Anyway, using your second example, I get error C2440: 'type cast' : cannot convert from '__int64' to '__m64'. - Schellens
Visual C/C++ is just about the worst compiler for SSE work - stuff that just works with gcc, ICC and other standard compilers often doesn't work with Microsoft compilers - you end up coding to the "lowest common denominator". I suggest that if you're stuck with Windows then you should at least switch to Intel's ICC compiler, which is a lot better in every regard (including performance of generated code). - Yttria
If you have $1,899 to spare and buy me ICC, I'd gladly switch to it. ;) - Schellens
Well if you're developing a commercial product then $1,899 is a very small investment which will more than pay for itself. If this is just for a personal project or free software though then you can use the union implementation above. - Yttria
My experience with ICC has been mixed. Although it "generally" compiles faster code, I've seen numerous cases of it going brain-dead and getting beaten out by MSVC (by large margins). - Floreated
@PaulR: It's indeed for a personal project. Btw, are you sure the union method works? I've tried doing something similar before and gotten myself into trouble with access violations and such... - Schellens
@Mehrdad: so long as you don't have any alignment issues then the union method should work - I've had to use this method for similar workarounds with Visual C in the past. - Yttria
@Mysticial: mostly I use gcc as a baseline when benchmarking SIMD code and ICC generally beats gcc, or at least matches it. I have to maintain compatibility with MSVC though so I do a little benchmarking from time to time, and MSVC-generated SIMD code is usually much slower - occasionally though MSVC will excel at some scalar code optimisations, I have to admit. - Yttria
@PaulR From my experience: Prior to VS2010, ICC consistently beat MSVC on nearly all SSE code I wrote. Starting from VS2010, I have to admit that MSVC beats ICC in more than half the cases I've tried. A notorious example of ICC optimization fail is on my answer here. MSVC (and GCC with the right options) gets peak performance. ICC fails to get even 70%. - Floreated
@Mysticial: thanks - that's useful information - due to product cycles etc I still don't use anything newer than VS 2008, but it sounds like it might be worth re-evaluating some of our SIMD benchmarks with VS 2010. I doubt they have fixed any of the other annoyances though (still no C99 support after > 10 years, unnecessary ABI restrictions, etc). - Yttria
