CS:APP example uses idivq with two operands?
Asked Answered
S

3

12

I am reading about x86-64 (and assembly in general) through the book "computer systems a programmer's perspective"(3rd edition). The author, in compliance with other sources from the web, states that idivq takes one operand only - just as this one claims. But then, the author, some chapters later, gives an example with the instruction idivq $9, %rcx.

Two operands? I first thought this was a mistake but it happens a lot in the book from there.

Also, the dividend should be given from the quantity in registers %rdx (high-order 64 bits) and %rax (low-order 64 bits) - so if this is defined in the architecture then it does not seem possible that the second operand could be a specified dividend.


Here is an example of an exercise (too lazy to write it all down - so a picture is the way to go). It claims that GCC emits idivq $9, %rcx when compiling a short C function.

Sugihara answered 18/9, 2019 at 18:38 Comment(11)
That's a mistake. Only imul has multi-operand and immediate forms, not mul, div, or idiv.Gronseth
No. Technically it could be a macro that initializes rdx:rax, not likely. Books like this always have a big, but incomplete errata.Alysaalyse
From the errata provided by @HansPassant (if you don't like to click links): "p. 199, third full paragraph. The reference to instruction idivl should be to idivq instead."Karleenkarlen
Wow perfect. Thank you guys! But @PeterCordes if it is a mistake then it seems weird that it occurs many times in the book :/Sugihara
The errata solved it for me btwSugihara
@JL2210: How does that help? Operand-size isn't the problem here, and this is already using idivq. Or are you just pointing out that the linked errata fails to mention this problem.Gronseth
Finding a mistake in a book is great progress!Scathing
@PeterCordes Hmm... I guess you're right. It was the only thing that my regex matched on the webpage. Guess I should've looked closer.Karleenkarlen
@PeterCordes ahh yea - right. My problem is of course still not solved unless it is a reoccurring mistake in the book.Sugihara
@NichlasUden What is the example meant to do?Karleenkarlen
CS:APP apparently has many serious AT&T syntax mistakes. e.g. I correctly guessed missing part of a function, but gcc generated assembly code doesn't match the answer is another one where it suggests that (%rbx, %rdi, %rsi) and (%rsi, %rsi, 9) are valid addressing modes! (The scale factor is actually a 2-bit shift count so these are total garbage and a sign of a serious lack of understanding by the authors, not a typo.)Gronseth
G
13

That's a mistake. Only imul has immediate and 2-register forms.

mul, div, or idiv still only exist in the one-operand form introduced with 8086, using RDX:RAX as the implicit double-width operand for output (and input for division).

Or EDX:EAX, DX:AX, or AH:AL, depending on operand-size of course. Consult an ISA reference like Intel's manual, not this book! https://www.felixcloutier.com/x86/idiv

Also see When and why do we sign extend and use cdq with mul/div? and Why should EDX be 0 before using the DIV instruction?

x86-64's only hardware division instructions are idiv and div. 64-bit mode removed aam, which does 8-bit division by an immediate. (Dividing in Assembler x86 and Displaying Time in Assembly has an example of using aam in 16-bit mode).

Of course for division by constants idiv and div (and aam) are very inefficient. Use shifts for powers of 2, or a multiplicative inverse otherwise, unless you're optimizing for code-size instead of performance.


CS:APP 3e Global Edition apparently has multiple serious x86-64 instruction-set mistakes like this in practice problems, claiming that GCC emits impossible instructions. Not just typos or subtle mistakes, but misleading nonsense that's very obviously wrong to people familiar with the x86-64 instruction set. It's not just a syntax mistake, it's trying to use instructions that aren't encodeable (no syntax can exist to express them, other than a macro that expands to multiple instructions. Defining idivq as a pseudo-instruction using a macro would be pretty weird).

e.g. I correctly guessed missing part of a function, but gcc generated assembly code doesn't match the answer is another one where it suggests that (%rbx, %rdi, %rsi) and (%rsi, %rsi, 9) are valid addressing modes! The scale factor is actually a 2-bit shift count so these are total garbage and a sign of a serious lack of knowledge by the authors about the ISA they're teaching, not a typo.

Their code won't assemble with any AT&T syntax assembler.

Also What does this x86-64 addq instruction mean, which only have one operand? (From CSAPP book 3rd Edition) is another example, where they have a nonsensical addq %eax instead of inc %rdx, and a mismatched operand-size in a mov store.


It seems that they're just making stuff up and claiming it was emitted by GCC. IDK if they start with real GCC output and edit it into what they think is a better example, or actually write it by hand from scratch without testing it.

GCC's actual output would have used multiplication by a magic constant (fixed-point multiplicative inverse) to divide by 9 (even at -O0, but this is clearly not debug-mode code. They could have used -Os).

Presumably they didn't want to talk about Why does GCC use multiplication by a strange number in implementing integer division? and replaced that block of code with their made-up instruction. From context you can probably figure out where they expect the output to go; perhaps they mean rcx /= 9.


These errors are from 3rd-party practice problems in the Global Edition

From the publisher's web site (https://csapp.cs.cmu.edu/3e/errata.html)

Note on the Global Edition: Unfortunately, the publisher arranged for the generation of a different set of practice and homework problems in the global edition. The person doing this didn't do a very good job, and so these problems and their solutions have many errors. We have not created an errata for this edition.

So CS:APP 3e is probably a good textbook, as long as you get the North American edition, or ignore the practice / homework problems. This explains the huge disconnect between the textbook's reputation and wide use vs. the serious and obvious (to people familiar with x86-64 asm) errors like this one that go beyond sloppy into don't-know-the-language territory.


How a hypothetical idiv reg, reg or idiv $imm, reg would be designed

Also, the dividend should be given from the quantity in registers %rdx (high-order 64 bits) and %rax (low-order 64 bits) - so if this is defined in the architecture then it does not seem possible that the second operand could be a specified dividend.

If Intel or AMD had introduced a new convenient forms for div or idiv, they would have designed it to use a single-width dividend because that's how compilers always use it.

Most languages are like C and implicitly promote both operands for + - * / to the same type and produce a result of that width. Of course if the inputs are known to be narrow that can be optimized away. (e.g. using one imul r32 to implement a * (int64_t)b).

But div and idiv fault if the quotient overflows so it's not safe to use a single 32-bit idiv when compiling int32_t q = (int64_t)a / (int32_t)b.

Compilers always use xor edx,edx before DIV or cdq or cqo before IDIV to actually do n / n => n-bit division.

Real full-width division using a dividend that isn't just zero- or sign-extended is only done by hand with intrinsics or asm (because gcc/clang and other compilers don't know when the optimization is safe), or in gcc helper functions that do e.g. 64-bit / 64-bit division in 32-bit code. (Or 128-bit division in 64-bit code).

So what would be most helpful is a div/idiv that avoids the extra instruction to set up RDX, too, as well as minimizing the number of implicit register operands. (Like imul r32, r/m32 and imul r32, r/m32, imm do: making the common case of non-widening multiplication more convenient with no implicit registers. That's Intel-syntax like the manuals, destination first)

The simplest way would be a 2-operand instruction that did dst /= src. Or maybe replaced both operands with quotient and remainder. Using a VEX encoding for 3 operands like BMI1 andn, you could maybe have
idivx remainder_dst, dividend, divisor. With the 2nd operand also an output for the quotient. Or you could have the remainder written to RDX with a non-destructive destination for the quotient.

Or more likely to optimize for the simple case where only the quotient is needed, idivx quot, dividend, divisor and not store the remainder anywhere. You can always use regular idiv when you want the quotient.

BMI2 mulx uses an implicit rdx input operand because its purpose is to allow multiple dep chains of add-with-carry for extended-precision multiply. So it still has to produce 2 outputs. But this hypothetical new form of idiv would exist to save code-size and uops around normal uses of idiv that aren't widening. So 386 imul reg, reg/mem is the point of comparison, not BMI2 mulx.

IDK if it would make sense to introduce an immediate form of idivx as well; you'd only use it for code-size reasons. Multiplicative inverses are more efficient division by constants so there's very little real-world use-case for such an instruction.

Gronseth answered 18/9, 2019 at 19:27 Comment(4)
csapp.cs.cmu.edu/3e/errata.html " Note on the Global Edition: Unfortunately, the publisher arranged for the generation of a different set of practice and homework problems in the global edition. The person doing this didn't do a very good job, and so these problems and their solutions have many errors. We have not created an errata for this edition." I think many of the errors come from this edition.Towards
@TomášKrál: Ah thanks, that explains a lot. I've heard it's a good textbook, and apparently popular, which seemed incompatible with the level and seriousness of the insane mistakes we've seen from it in SO questions.Gronseth
Incidentally, I found if you want code from gcc that you can understand, you usually want -Os.Lemire
@Joshua: Yeah, can be useful depending on what you want to see. With GCC that gets it to avoid the multiplicative inverse trick, even though that's not a smart size vs. speed tradeoff. I'd thought my answer on How to remove "noise" from GCC/clang assembly output? had mentioned -Os and -Og, but apparently I didn't before. Edited to add it.Gronseth
K
2

I think your book has made a mistake.

idivq only has one operand. If I try to assemble this snippet:

idivq $9, %rcx

I get this error:

test.s: Assembler messages:
test.s:1: Error: operand type mismatch for `idiv'

This works:

idivq %rcx

but you probably already know that.

It may also be a macro (unlikely, but possible. credit to @HansPassant for this).

Perhaps you should contact the book's author so that they can add an entry to the errata.

Karleenkarlen answered 18/9, 2019 at 19:3 Comment(1)
Thanks for the input! Regarding the macro comment, it must have been mentioned if it is, and it has not been mentioned. So alright, I now know that it is a reoccurring mistake :)Sugihara
C
1

Interestingly, gas seems to allow the following:

mov $20, %rax
mov $0, %rdx
mov $5, %rcx
idivq %rcx, %rax
ret

This is still performing the one operand division under the hood, but it LOOKS like two-operand form. As long as the first operand is a register and the second operand is specifically %rax, this works. However, in general idivq seems to require the one operand form.

Capitalist answered 9/10, 2019 at 6:44 Comment(2)
The only machine encoding of idiv is idiv r/m (felixcloutier.com/x86/idiv), with the RDX:RAX operand implicit. Interesting (and weird) that GAS / AT&T syntax allows you to optionally specify part of the implicit operand. It doesn't get treated as a pseudo-instruction that can expand to mov-immediate + idiv because AT&T syntax doesn't do that, and there's no tmp reg it could use for the divisor anyway.Gronseth
Unlike higher-level languages, what's allowed is (partly) determined by what's encodeable into machine code. It's not like GAS could decide to allow anything else. It doesn't compile, it merely assembles. There is some precedent for naming the implicit operands as documentation, e.g. with stosb vs. stos %al,%es:(%rdi) (AT&T) / stos BYTE PTR es:[rdi],al (Intel). felixcloutier.com/x86/stos:stosb:stosw:stosd:stosq mentions that some assemblers can indicate operand-size this way with a single operand, leaving AL/AX/EAX/RAX implicit.Gronseth

© 2022 - 2024 — McMap. All rights reserved.