Assembly syntax for masked vector Intel AVX-512 instructions

For testing purposes, I am writing short assembly snippets for Intel's Xeon Phi using icc's inline assembler. I now want to use masked vector instructions, but I cannot get the inline assembler to accept them.

For code like this:

vmovapd  -64(%%r14, %%r10), %%zmm0{%%k1} 

I get the error message

/tmp/icpc5115IWas_.s: Assembler messages:
/tmp/icpc5115IWas_.s:563: Error: junk `%k1' after register

I tried a lot of different combinations, but nothing worked. The compiler version is intel64/13.1up03 under Linux, using GAS syntax.

Edit: The code above actually works with basic (non-extended) inline assembly. So this:

__asm__("vmovapd  -64(%r14, %r10), %zmm0{%k1} ")

works, while the following does not:

__asm__("vmovapd  -64(%[src], %%r10), %%zmm0{%%k1} "
    :
    : [src]"r"(src)
    :)

I guessed it had something to do with having to write a double % before register names in extended mode, but a single % before the k does not work either.

Contrabassoon answered 9/1, 2014 at 22:16 Comment(0)

I asked the same question in the Intel Developer Zone (http://software.intel.com/en-us/forums/topic/499145#comment-1776563). The answer is that, to use the mask registers on the Xeon Phi in extended inline assembler, you have to put double curly braces around the mask register modifier:

vmovapd     %%zmm30,         (%%r15,    %%r10){{%%k1}}
Contrabassoon answered 13/1, 2014 at 11:32 Comment(2)
Regular curly braces in GNU C inline asm are for syntax dialect alternatives, like add {%0, %1 | %1, %0}, which let you write code that works with either AT&T or Intel syntax, so you can compile it with or without -masm=intel. - Einhorn
Also, the recommended way is to escape the { as %{ and the } as %}, e.g. "... %{%%k1%} \n" - Einhorn
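
To illustrate why unescaped braces are consumed by the compiler, here is a small sketch of a dialect alternative in GNU C extended asm. The function name `add_ints` is made up for this example, and the plain C branch is only an assumed fallback for non-x86 targets.

```c
/* Hypothetical example: returns a + b using one asm template that is
   valid under both -masm=att (the default) and -masm=intel. */
static int add_ints(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__("add {%[b], %[a] | %[a], %[b]}" /* {AT&T operands | Intel operands} */
            : [a] "+r"(a)
            : [b] "r"(b)
            : "cc");
#else
    a += b; /* fallback where x86 inline asm is unavailable */
#endif
    return a;
}
```

The compiler keeps only the half of `{ ... | ... }` that matches the selected dialect before the text reaches the assembler, which is why a literal `{%%k1}` never arrives intact unless its braces are escaped.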

I think you need to use the masked variant of the instruction: VMASKMOVPD

Saintsimon answered 10/1, 2014 at 14:28 Comment(3)
VMASKMOVPD exists only in AVX, not in KNI. It was not included there because KNI has universal vector lane masking instead. - Contrabassoon
I don't understand what you mean. vmovapd and vmaskmovpd are both AVX512 instructions. I don't know what KNI is in this context; the only Intel use of this TLA with which I'm familiar is Kernel NIC Interface. - Saintsimon
KNI stands for Knights Corner New Instructions, the vector instruction set of the Xeon Phi. AVX512 is pretty similar, and both instruction sets will probably converge in the future. - Contrabassoon
