What comes after QWORD?

If

  • 8 bits is a byte

  • two bytes is a word

  • four bytes is a dword

  • 8 bytes is a qword

What is a good name for 16 bytes?

Thane answered 22/9, 2016 at 17:13 Comment(3)
xword? can't really use "hexword", since that's kinda ambiguous - Operator
octoword, and of course 4 bits is a nybble - Cattery
16 bytes is a paragraph. I found it in the dos programming manual long ago. I have no idea why. But does anybody actually use that anymore? - Woodruff

TL;DR: In NASM, after RESB/RESW/RESD/RESQ there's RESO, RESY, and RESZ. In instruction mnemonics and Intel terminology (used in manuals), O (oct) and DQ (double-quad) are both used. But DQWORD isn't used, only OWORD.

Disassemblers will use xmmword ptr [rsi] for memory operand explicit sizes in MASM or .intel_syntax GNU syntax. IIRC, there are no instructions where that size isn't already implied by the mnemonic and/or register.


Note that this question is x86-specific, and is about Intel's terminology. In most other ISAs (like ARM or MIPS), a "word" is 32 bits, but x86 terminology originated with 8086.

Terminology in instruction mnemonics

Octword is used in the mnemonics for some x86-64 instructions. e.g. CQO sign-extends rax into rdx:rax.
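
As a rough illustration of what CQO does, here is a minimal C sketch that sign-extends a qword into a high:low pair standing in for rdx:rax. The `oword_t` struct and the `cqo_model` function are hypothetical names made up for this example; they don't come from Intel's manuals or any real library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte "oword" as two qwords: lo models rax, hi models rdx. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} oword_t;

static_assert(sizeof(oword_t) == 16, "an oword is 16 bytes");

/* Model of CQO: replicate the sign bit of the qword across the high half. */
static oword_t cqo_model(int64_t rax)
{
    oword_t r;
    r.lo = (uint64_t)rax;
    r.hi = (rax < 0) ? UINT64_MAX : 0;
    return r;
}
```

So `cqo_model(-1)` gives all-ones in both halves, and `cqo_model(5)` gives a zero high half.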

CMPXCHG16B is another non-vector instruction that operates on 16 bytes, but Intel doesn't use "oct" anywhere in the description. Instead, they describe the memory location as an m128. That manual entry doesn't use any "word"-based sizes.

SSE/AVX Integer instructions often have an element-size as part of the mnemonic. In that context, DQ (double-quad) is used, never O (oct). For example, the PUNPCKL* instructions that interleave elements from half of two source vectors into a full destination vector:

  • PUNPCKLWD: word->dword (16->32)
  • PUNPCKLDQ: dword->qword (32->64)
  • PUNPCKLQDQ: two qwords->full 128bit register (64->128).
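
The interleaving in the list above can be sketched as a scalar model in C. This is only an illustration of the data movement, not how you'd write it in practice (real code would use the instructions or their intrinsics). `punpckl_model` and `VEC_BYTES` are hypothetical names for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16  /* one XMM register */

/*
 * Scalar model of the PUNPCKL* family: interleave elements from the low
 * halves of a and b. elem_bytes is 2 for PUNPCKLWD, 4 for PUNPCKLDQ and
 * 8 for PUNPCKLQDQ. dst must not alias a or b.
 */
static void punpckl_model(uint8_t dst[VEC_BYTES],
                          const uint8_t a[VEC_BYTES],
                          const uint8_t b[VEC_BYTES],
                          size_t elem_bytes)
{
    size_t pairs = VEC_BYTES / (2 * elem_bytes);  /* elements taken from each source */
    for (size_t i = 0; i < pairs; i++) {
        memcpy(dst + (2 * i) * elem_bytes,     a + i * elem_bytes, elem_bytes);
        memcpy(dst + (2 * i + 1) * elem_bytes, b + i * elem_bytes, elem_bytes);
    }
}
```

With `elem_bytes == 8` there is a single pair, so the result is just the low qword of `a` followed by the low qword of `b`: two qwords filling one double-quad.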

However, it's only ever DQ, not DQWord. Double-Quadword sounds somewhat unnatural, but I think it might be used in Intel manuals occasionally. It sounds better if you leave out the "Word", and just say "Store a Double-Quad at this location". If you want to attach "word" to it, I think only OWord sounds natural.

There's also MOVDQA for load/store/reg-reg moves. Mercifully, when AVX extended the vector width to 256b, they kept the same mnemonics and didn't call the 256b version VMOVQQA.

Some instructions for manipulating the 128-bit lanes of 256-bit registers have a 128 in the name, like VEXTRACTF128. Putting a numeric size in the mnemonic is new for Intel (other than CMPXCHG8B and CMPXCHG16B).


Assembler directives:

From the NASM manual:

3.2.1 DB and Friends: Declaring Initialized Data

DB, DW, DD, DQ, DT, DO, DY and DZ are used ... (table of examples)

DO, DY and DZ do not accept numeric constants as operands.

DT is a ten-byte x87 float. DO is 16 bytes, DY is a YMMWORD (32 bytes), and DZ is 64 bytes (AVX512 ZMM). Since they don't support numeric constants as initializers, I guess you could only use them with string literal initializers? It would be more normal anyway to use DB/DW/DD/DQ with a comma-separated list of per-element initializers.

Similarly, you can reserve uninitialized space.

realarray       resq    10              ; array of ten reals 
ymmval:         resy    1               ; one YMM register 
zmmvals:        resz    32              ; 32 ZMM registers
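
To keep the size letters straight, here's a small C sketch of the letter -> size mapping shared by the Dx and RESx directives, plus the byte count a reservation works out to. The function names `nasm_size_bytes` and `res_bytes` are hypothetical and exist only for this example; the sizes themselves are the ones described above.

```c
#include <stddef.h>

/*
 * Bytes per element for the size letters used by NASM's Dx and RESx
 * directives (DB/RESB ... DZ/RESZ). Returns 0 for an unknown letter.
 */
static size_t nasm_size_bytes(char letter)
{
    switch (letter) {
    case 'B': return 1;   /* byte */
    case 'W': return 2;   /* word */
    case 'D': return 4;   /* dword */
    case 'Q': return 8;   /* qword */
    case 'T': return 10;  /* x87 extended-precision float */
    case 'O': return 16;  /* oword, one XMM register */
    case 'Y': return 32;  /* yword, one YMM register */
    case 'Z': return 64;  /* zword, one ZMM register */
    default:  return 0;
    }
}

/* Total bytes reserved by e.g. "resz 32": element size times count. */
static size_t res_bytes(char letter, size_t count)
{
    return nasm_size_bytes(letter) * count;
}
```

By this mapping, the three reservations above come to 80, 32, and 2048 bytes respectively.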

Terminology in intrinsics, and AVX512

As I mentioned in my answer on How can Microsoft say the size of a word in WinAPI is 16 bits?, AVX512's per-element masking during other operations makes naming tricky. VSHUFF32x4 shuffles 128b elements, with masking at 32bit element granularity.

However, Intel is not backing away from word=16 bits. e.g. AVX512BW and AVX512DQ put that terminology right in the name. Some intrinsics even use them, where previously it was always epi32, not d. (i.e. _mm256_broadcastd_epi32(__m128i), _mm256_broadcastw_epi16(__m128i). The b/w/d/q is totally redundant. Maybe that was a mistake?)

(Does anyone else find the asm mnemonics easier to remember and type than the annoyingly-long intrinsics? You have to know the asm mnemonics to read compiler output, so it would be nice if the intrinsics just used the mnemonics instead of a second naming scheme.)

Grateful answered 23/9, 2016 at 1:55 Comment(1)
Re: why qword is 64-bit on x86: What's the size of a QWORD on a 64-bit machine? - Grateful

I don't think that this is in common use (even though SSE/AVX have 128-bit and 256-bit operands, they're vectors of elements that are no more than qword size), but the obvious extension after "quad" would be octoword / oword.

Renwick answered 22/9, 2016 at 17:17 Comment(6)
now that I think of it it does get confusing past 64 bits. maybe mbyte(x)/mword(x)/mdword(x)/mqword(x) would reduce confusion of too many names. eg multi-primitive. I can imagine bugs from people having trouble remembering how many bytes are in oword+. - Thane
@Dmitry I never liked the word/dword/qword thing in the first place, I think it's confusing for the same reason. Originally there was only "byte" and "word" and that was okay, but as soon as "dword" came along it was a bad system :) - Renwick
@Renwick the thing is it took quite an era before dword emerged... ;) - Bunkmate
1. In the beginning there was the Word, and Word had two Bytes and there was nothing else. 2. And God divided the ones from the zeros and saw that it was good. 3. And God said, Let there be data: and there were data - Cavil
I don't find it confusing at all, unless I look at anything outside of Intel terminology. e.g. gdb's x command calls a 16-bit element a half-word. ARM even has d (double) registers that are 64-bit, and q registers that are 128-bit. Now that's confusing, but the reasons are obvious (of course an ARM word is 32-bit). - Grateful
Anyway, it's just a simple letter -> size mapping, and if you remember the mnemonics for any vector instructions (pshufb / pshuflw / pshufd), you automatically remember that d just means 32-bit. I just think of it as meaning 32-bit, not as meaning two words that are separate in any way. BTW, AT&T syntax uses a different letter->size convention, where l->32 bits as an operand-size suffix. It stands for long, like the .long directive, because the syntax dates back to CPUs like m68k. (move.l vs. move.w) - Grateful
