pow for SSE types
I

4

9

I do some explicitly vectorised computations using SSE types such as __m128 (defined in xmmintrin.h etc.), but now I need to raise all elements of a vector to the same power. Ideally I would want something like __m128 _mm_pow_ps(__m128, float), which unfortunately doesn't exist.

What is the best way around this? I could store the vector, call std::pow on each element, and then reload it. Is this the best I can do? How do compilers implement a call to std::pow when auto-vectorising code that otherwise is well vectorisable? Are there any libraries that provide something useful?
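For reference, the store / std::pow / reload fallback described above could look roughly like this. It is a minimal sketch, not a recommendation: the name pow_ps_scalar_fallback is made up for illustration and is not a real intrinsic.

```cpp
#include <cmath>
#include <xmmintrin.h>

// Hypothetical helper (NOT a real intrinsic): raises each lane of v to the
// power p by spilling to memory and calling scalar std::pow per element.
inline __m128 pow_ps_scalar_fallback(__m128 v, float p)
{
    alignas(16) float lanes[4];
    _mm_store_ps(lanes, v);        // write the four lanes to an aligned buffer
    for (float& x : lanes)
        x = std::pow(x, p);        // scalar pow on each lane
    return _mm_load_ps(lanes);     // reload the results as a vector
}
```

This keeps the accuracy and special-case handling of std::pow, at the cost of leaving the vector registers for every call.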

(Note that this question is related but not a duplicate, and certainly doesn't have a useful answer.)

Irfan answered 19/9, 2014 at 14:14 Comment(3)
I've used gruntthepeon.free.fr/ssemath for exp/log and written pow(x,k) as exp(k*log(x)) when auto-vectorisation was not an option. Not sure how it compares with auto-vectorised code.Espalier
You could use Agner Fog's vector class. He has SIMD math functions (including pow, exp, log, sin,...) for SSE, AVX, and AVX512 for single and float and ints. I don't see any good reason to use Intel's SVML or AMD's libm anymore.Career
@Zboson, Is there a good C library for exp() with SSE4 support?Venereal
C
9

Use the formula exp(y*log(x)) for pow(x, y) and a library with SSE implementations of exp() and log().

Edit by @Royi: The above holds only when both x and y are positive. Otherwise more careful math is needed. See https://math.stackexchange.com/questions/2089690.
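A scalar sketch of the identity may help to see where it breaks down. The function names here are hypothetical. A SIMD version would swap std::exp and std::log for vector routines (for example exp_ps and log_ps from the ssemath library mentioned in the comments). The second function shows one possible way to treat a negative base when the exponent is an integer. It is an assumption about what "more careful math" could mean, not something taken from the linked discussion.

```cpp
#include <cmath>

// Hypothetical scalar reference for pow(x, y) = exp(y * log(x)).
// For x < 0, std::log(x) is NaN, so the result is NaN as well.
inline float pow_via_exp_log(float x, float y)
{
    return std::exp(y * std::log(x));
}

// Hypothetical variant for a negative base with an INTEGER exponent n:
// compute the magnitude from |x| and restore the sign when n is odd.
// x == 0 is not treated specially here.
inline float pow_signed_base(float x, int n)
{
    const float mag = std::exp(static_cast<float>(n) * std::log(std::fabs(x)));
    return (x < 0.0f && n % 2 != 0) ? -mag : mag;
}
```

Both versions are generally less accurate than a dedicated pow, because the rounding error in log(x) is amplified by the multiplication with the exponent.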

Colubrid answered 19/9, 2014 at 14:46 Comment(6)
I had a look at that library. It looks restricted to gcc, only knows about SSE2, and the documentation in the code is poor. I would also want it for the AVX types __m256 and __m256d.Irfan
@Irfan works fine with MSVC (note benchmarks with VS2010 at the bottom of the link), and code becomes more clear when looking at cephes library which seems to be the main inspiration.Espalier
I need it to work with gcc, icc, clang. The original cephes library is great! If there is nothing better, I can at least implement my own log and exp along the lines of these libraries.Irfan
@Irfan Did you even try to compile the library I've linked? There are about 5 compiler-specific lines, and they should work with all the compilers you've mentioned.Colubrid
Sorry, had other things to do in the last couple of days. But this library does not support all the vector types I need.Irfan
Using this formula, how does one handle negative numbers?Dickson
B
2

I really recommend the Intel Short Vector Math Library for these types of operations. The library is bundled with the Intel compiler, which you mention in the list of compilers to support. I doubt it would be useful for gcc and clang, but it could serve as a reference point for benchmarking whatever pow implementation you come up with.

https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-DEB8B19C-E7A2-432A-85E4-D5648250188E.htm

Benedic answered 22/9, 2014 at 10:4 Comment(4)
SVML can be useful with gcc. gcc -mveclibabi=svml will even let the vectorizer create calls to vmlsPow4 and such.Rostock
@MarcGlisse, Does gcc include Intel SVML built in?Venereal
gcc does not include SVML, it only has the knowledge of how to generate calls to it, if you promise that you will have it available for linking.Rostock
Your link is no longer working.Kobe
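To illustrate the -mveclibabi=svml comments above: the source only needs to be an ordinary loop that the auto-vectoriser can handle. The sketch below uses a hypothetical function name. According to the gcc documentation, the option also needs -ftree-vectorize and -funsafe-math-optimizations, and an SVML-compatible library must be supplied at link time. Whether the loop is actually turned into SVML calls depends on the compiler version and flags.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper: a plain loop over std::pow, written so that the
// auto-vectoriser has a chance. Possible gcc invocation (assumption):
//   g++ -O2 -ftree-vectorize -funsafe-math-optimizations -mveclibabi=svml ...
// Without those flags this compiles to ordinary scalar pow calls.
void pow_array(float* __restrict out, const float* __restrict in,
               float p, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = std::pow(in[i], p);   // candidate for a vector pow call
}
```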
H
2

An AVX version of the ssemath library is now available: http://software-lisc.fbk.eu/avx_mathfun/

With this library you can write:

exp256_ps(_mm256_mul_ps(_mm256_set1_ps(y), log256_ps(x))); // pow(x, y) for x > 0, with x a __m256 and y a float
Hernandes answered 29/3, 2017 at 11:12 Comment(2)
Yes, these provide the log, exp, sin, cos, and a sincos function for 8 floats using AVX. Unfortunately, the corresponding double versions are still outstanding (I actually need those more at the moment).Irfan
You could try the Intel SPMD compiler (ispc.github.io/ispc.html); the documentation says it supports pow and AVX.Hernandes
P
-2

Make a vector out of the float.

 _mm_pow_ps(v, _mm_set1_ps(f))
Polyneuritis answered 10/4, 2015 at 5:48 Comment(2)
There is no _mm_pow_ps(), I'm afraid. Otherwise I would not have asked.Irfan
Ah, I misunderstood the question. A Taylor series is the traditional approach; as noted earlier, gruntthepeon.free.fr/ssemath is a good resource. Depending on how important accuracy is, you can make the number of terms much lower.Sierrasiesser
