Intel AVX-512: how to set the EVEX.z bit - McMap

Asked 20/3, 2020 at 16:52 Answered 20/3, 2020 at 17:25

Solved assembly x86 machine-code avx512

The EVEX.z bit is used in AVX-512 in conjunction with the k registers to control masking. If the z bit is 0, it's merge-masking; if the z bit is 1, it's zero-masking, meaning the destination elements whose bit in the k register is 0 are zeroed in the output.

The syntax looks like this:

VPSUBQ zmm0{k2}{z},zmm1,zmm2

where {z} represents the z bit.

But how do you set or test the EVEX.z bit? I've searched every resource I can find but I haven't found an answer.

Ixion answered 20/3, 2020 at 16:52 Comment(3)

I think it's a bit of the opcode, not part of the CPU state. – Whoredom 20/3, 2020 at 17:9

Does that mean I can't set it -- in other words if I execute the VPSUBQ instruction I show above there is no flexibility on the z bit? – Ixion 20/3, 2020 at 17:14

That is correct. The z bit will always be set as specified in the instruction. – Fiddlewood 20/3, 2020 at 20:51

As I understand it, what they mean is that VPSUBQ zmm0{k2}{z},zmm1,zmm2 and VPSUBQ zmm0{k2},zmm1,zmm2 are two different instructions whose encodings differ in a single bit, called the "z bit". (It's specifically part of the EVEX prefix to the instruction. Wikipedia documents all the fields.)

So you "set the z bit" by specifying {z} in your assembler source, telling the assembler to generate an instruction with the corresponding bit set. This is documented in lots of places, like Intel's vol. 2 instruction set manual, and to some extent in Intel's intrinsics guide, which has mask (merge-masking) and maskz (zero-masking) versions of most intrinsics.

It is not a physical bit in the CPU state, like the direction flag, that would persist from one instruction to the next. So it doesn't make sense to "test" it.

To illustrate, here's what I get by assembling both versions:

00000000  62F1F5CAFBC2      vpsubq zmm0{k2}{z},zmm1,zmm2
00000006  62F1F54AFBC2      vpsubq zmm0{k2},zmm1,zmm2

Note the encodings differ in the high bit of the fourth byte. That's your "z bit".
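As a rough illustration, here is a small Python sketch that pulls the z bit (and the mask register number) out of the two encodings above. The helper names `evex_z_bit` and `evex_mask_reg` are made up for this example, and the code assumes the bytes start directly with the 4-byte EVEX prefix (0x62 plus three payload bytes); it does no other validation.

```python
def evex_z_bit(encoding: bytes) -> int:
    """Return the EVEX.z bit of an EVEX-encoded instruction.

    Assumption: `encoding` begins with the EVEX prefix byte 0x62.
    """
    if len(encoding) < 4 or encoding[0] != 0x62:
        raise ValueError("expected an EVEX prefix (0x62) at offset 0")
    # z is the high bit (bit 7) of the fourth byte.
    return (encoding[3] >> 7) & 1


def evex_mask_reg(encoding: bytes) -> int:
    """Return the k register number (the low three bits of the fourth byte)."""
    if len(encoding) < 4 or encoding[0] != 0x62:
        raise ValueError("expected an EVEX prefix (0x62) at offset 0")
    return encoding[3] & 0b111


zeroing = bytes.fromhex("62F1F5CAFBC2")  # vpsubq zmm0{k2}{z},zmm1,zmm2
merging = bytes.fromhex("62F1F54AFBC2")  # vpsubq zmm0{k2},zmm1,zmm2

print(evex_z_bit(zeroing), evex_mask_reg(zeroing))  # prints: 1 2
print(evex_z_bit(merging), evex_mask_reg(merging))  # prints: 0 2
```

Both encodings name k2 as the mask register; only the z bit changes.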

Maybe you were thinking that you could "set" or "clear" the z bit at runtime, thus changing the masking effect of subsequent instructions? Since it's part of the encoding of each instruction, not the CPU state, that way of thinking only works if you are JITing the instructions on the fly or using self-modifying code.
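For what it's worth, the "patch" a JIT would apply is just a one-bit change in the instruction bytes. The sketch below does that on an ordinary Python byte buffer, not on executable memory; `set_evex_z` is a hypothetical helper, and it assumes the caller already knows the offset at which the EVEX prefix starts.

```python
def set_evex_z(code: bytearray, offset: int, zeroing: bool) -> None:
    """Set or clear EVEX.z for the instruction whose EVEX prefix starts at `offset`.

    Assumption: code[offset] is the 0x62 EVEX prefix byte of a masked instruction.
    """
    if code[offset] != 0x62:
        raise ValueError("expected an EVEX prefix (0x62) at the given offset")
    if zeroing:
        code[offset + 3] |= 0x80   # set the high bit of the fourth byte
    else:
        code[offset + 3] &= 0x7F   # clear it again


buf = bytearray.fromhex("62F1F54AFBC2")  # vpsubq zmm0{k2},zmm1,zmm2
set_evex_z(buf, 0, True)
print(buf.hex().upper())                 # prints: 62F1F5CAFBC2
set_evex_z(buf, 0, False)
print(buf.hex().upper())                 # prints: 62F1F54AFBC2
```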

In "normal" ahead-of-time code, you'll have to write the code in both versions, once with {z} instructions and once without. Use a conditional jump to decide which version to execute.
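To make the two behaviours and the branch between them concrete, here is a pure-Python model (not real AVX-512 code). All function names are invented for the example. The model treats a zmm register as a list of eight 64-bit lanes and a k register as an integer whose bit i governs lane i.

```python
MASK64 = (1 << 64) - 1  # lanes wrap around like 64-bit integers


def vpsubq_merge(dst, k, src1, src2):
    """Model of vpsubq dst{k},src1,src2: masked-off lanes keep dst's old value."""
    return [(a - b) & MASK64 if (k >> i) & 1 else old
            for i, (old, a, b) in enumerate(zip(dst, src1, src2))]


def vpsubq_zero(dst, k, src1, src2):
    """Model of vpsubq dst{k}{z},src1,src2: masked-off lanes become 0."""
    return [(a - b) & MASK64 if (k >> i) & 1 else 0
            for i, (_, a, b) in enumerate(zip(dst, src1, src2))]


def vpsubq_dispatch(zeroing, dst, k, src1, src2):
    # The runtime choice is an ordinary branch between two separate code
    # paths, standing in for the conditional jump in assembly.
    if zeroing:
        return vpsubq_zero(dst, k, src1, src2)
    return vpsubq_merge(dst, k, src1, src2)


zmm0 = [100] * 8
zmm1 = [10, 20, 30, 40, 50, 60, 70, 80]
zmm2 = [1] * 8
k2 = 0b00000101  # only lanes 0 and 2 are active

print(vpsubq_dispatch(False, zmm0, k2, zmm1, zmm2))
# prints: [9, 100, 29, 100, 100, 100, 100, 100]
print(vpsubq_dispatch(True, zmm0, k2, zmm1, zmm2))
# prints: [9, 0, 29, 0, 0, 0, 0, 0]
```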

Whoredom answered 20/3, 2020 at 17:25 Comment(4)

We're not supposed to say thank you on SO but your answer expands the knowledge base because it's the only explanation I know of about how it's set. So thanks. – Ixion 20/3, 2020 at 17:35

@RTC222: AVX512 EVEX prefix encoding / bits are fully documented in Intel's vol.2 manual. And being able to use merge-masking vs. zero-masking (and rounding-mode overrides) is mentioned in lots of places, including Intel's intrinsics guide. (Although that guide's example of ASM syntax just uses {z} instead of {k1}{z}, which is weird even though it's documenting intrinsics not asm.) Anyway, this answer nicely puts the pieces together. – Encouragement 20/3, 2020 at 20:12

I found verbal descriptions in section 15.1.4 of Vol. 1 and 3.1.1.3 of Vol. 2A of the Combined Volumes Oct 2019, but neither one explained that you don't set the z bit, you eliminate it from the instruction where needed. Nate Eldredge's answer clarifies it very well. – Ixion 20/3, 2020 at 22:1

The syntax of decorators proposed by Intel and maintained by NASM is not documented enough. It doesn't specify if the decorators are case insensitive {Z}, whether they can be swapped {z}{k2}, where exactly they may be put in the operand list... That's why other assemblers use alternative syntax: euroassembler.eu/eadoc/#ZEROINGeq – Bruni 21/3, 2020 at 9:46
