I'm rewriting some code from AVX2 to AVX512.
What's the equivalent intrinsic for broadcasting a single float to a __m512 vector? In AVX2 it is _mm256_broadcast_ss(), but I can't find anything like _mm512_broadcast_ss().
AVX512 doesn't need a special intrinsic for the memory-source version (see Footnote 1). You can simply use _mm512_set1_ps (which takes a float, not a float*). The compiler should use a memory-source broadcast if that's efficient. (Potentially even folded into a broadcast memory source for an ALU instruction instead of a separate load; AVX512 can do that for 512-bit vectors.)
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm512_set1_ps&expand=5236,4980
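As a minimal sketch of the suggestion above, the following broadcasts a float loaded through a pointer with _mm512_set1_ps and multiplies 16 floats by it. The helper names (scale16, scale16_avx512, scale16_scalar) are hypothetical, and the sketch assumes GCC or Clang on x86: it uses the target attribute and __builtin_cpu_supports so it builds without -mavx512f and falls back to a scalar loop on CPUs without AVX512F.

```c
#include <immintrin.h>
#include <stddef.h>

/* Hypothetical helper: dst[i] = src[i] * (*factor) for 16 floats.
   The target attribute (GCC/Clang) enables AVX512F for this function only. */
__attribute__((target("avx512f")))
static void scale16_avx512(float *dst, const float *src, const float *factor)
{
    /* _mm512_set1_ps takes the float by value. Dereferencing the pointer
       here leaves the compiler free to emit a memory-source broadcast,
       or to fold the broadcast into the multiply. */
    __m512 f = _mm512_set1_ps(*factor);
    __m512 v = _mm512_loadu_ps(src);
    _mm512_storeu_ps(dst, _mm512_mul_ps(v, f));
}

/* Scalar fallback for CPUs without AVX512F. */
static void scale16_scalar(float *dst, const float *src, const float *factor)
{
    for (size_t i = 0; i < 16; ++i)
        dst[i] = src[i] * *factor;
}

/* Runtime dispatch, so the sketch also runs on non-AVX512 machines. */
static void scale16(float *dst, const float *src, const float *factor)
{
    if (__builtin_cpu_supports("avx512f"))
        scale16_avx512(dst, src, factor);
    else
        scale16_scalar(dst, src, factor);
}
```

In a build that already passes -mavx512f (or -march=native on an AVX512 machine), the attribute and the dispatch are unnecessary and the body of scale16_avx512 can be used directly. Which instruction the compiler actually picks for the broadcast depends on the compiler and optimization level.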
Footnote 1: The reason for _mm256_broadcast_ss even existing separately from _mm256_set1_ps is probably because of AVX1 vbroadcastss ymm, [mem] vs. AVX2 vbroadcastss ymm, xmm. Some compilers like MSVC and ICC let you use intrinsics without enabling the ISA extensions for the compiler to use anywhere, so there needed to be an intrinsic for only the AVX1 memory-source version specifically.
With AVX512, both memory and register source forms were introduced with AVX512F so there's no need to give users of those compilers a way to micro-manage which asm is allowed.
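To illustrate the footnote, here is a small sketch comparing the two 256-bit forms: _mm256_broadcast_ss, which takes a pointer, and _mm256_set1_ps, which takes the value. The function names are hypothetical, and the sketch again assumes GCC or Clang on x86 (target attribute plus a runtime CPU check). Both forms should produce the same vector; only the instruction the compiler is allowed to choose differs.

```c
#include <immintrin.h>
#include <string.h>

/* Hypothetical check: do both AVX broadcast forms give the same vector? */
__attribute__((target("avx")))
static int broadcast_forms_match_avx(const float *p)
{
    __m256 a = _mm256_broadcast_ss(p);  /* pointer argument: memory-source form */
    __m256 b = _mm256_set1_ps(*p);      /* value argument: compiler picks the asm */

    float fa[8], fb[8];
    _mm256_storeu_ps(fa, a);
    _mm256_storeu_ps(fb, b);
    return memcmp(fa, fb, sizeof fa) == 0;
}

/* Returns 1 if the vectors match, 0 if they differ,
   and -1 if the CPU has no AVX, in which case nothing was compared. */
static int broadcast_forms_match(const float *p)
{
    if (!__builtin_cpu_supports("avx"))
        return -1;
    return broadcast_forms_match_avx(p);
}
```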
_mm256_set1_ps( *ptr ) with AVX1 as well; I'm not sure why _mm256_broadcast_ss even exists. Maybe because of some compilers like MSVC that never optimize intrinsics and don't let you avoid AVX2 instructions with command line options? So you can use _mm256_broadcast_ss to make sure you get the AVX1 memory-source version, and _mm256_set1_ps to also allow the AVX2 register source vbroadcastss ymm, xmm version, whichever is convenient for the compiler? Anyway, fortunately AVX512 introduced both mem and reg source versions with the same extension. –
Bencher
_mm512_broadcastss_ps – Witchy
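For reference, _mm512_broadcastss_ps takes an __m128 and broadcasts its lowest element, so it fits when the value is already in a vector register rather than a plain float. A minimal sketch, with hypothetical function names and the same GCC/Clang-on-x86 assumptions as above, checking that it gives the same result as _mm512_set1_ps:

```c
#include <immintrin.h>
#include <string.h>

/* Hypothetical check: register-source broadcast vs. _mm512_set1_ps. */
__attribute__((target("avx512f")))
static int reg_broadcast_matches_set1_avx512(float x)
{
    __m128 lo = _mm_set_ss(x);              /* x in element 0, other elements zero */
    __m512 a  = _mm512_broadcastss_ps(lo);  /* broadcast element 0 of an __m128 */
    __m512 b  = _mm512_set1_ps(x);          /* broadcast a scalar float */

    float fa[16], fb[16];
    _mm512_storeu_ps(fa, a);
    _mm512_storeu_ps(fb, b);
    return memcmp(fa, fb, sizeof fa) == 0;
}

/* Returns 1 if the vectors match, 0 if they differ,
   and -1 if the CPU has no AVX512F, in which case nothing was compared. */
static int reg_broadcast_matches_set1(float x)
{
    if (!__builtin_cpu_supports("avx512f"))
        return -1;
    return reg_broadcast_matches_set1_avx512(x);
}
```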