When and why do we sign extend and use cdq with mul/div?

I had a test today, and the only question I didn't understand involved converting a doubleword to a quadword.

That got me thinking, why/when do we sign extend for multiplication or division? In addition, when do we use instructions like cdq?

Frieda answered 7/4, 2016 at 0:57 Comment(0)

Use cdq / idiv for signed 32-bit / 32-bit => 32 bit division,
xor edx,edx / div for unsigned.

In both cases the dividend starts in EAX, and the divisor is the operand you give to DIV or IDIV.

   mov  eax, 1234
   mov  ecx, 17
   cdq                   ; EDX = signbit(EAX)
   idiv  ecx             ; EAX = 1234/17     EDX = 1234%17

If you zero EDX/RDX instead of sign-extending into EDX:EAX before idiv, you can get a large positive result for -5 / 2, for example.
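To make the -5 / 2 case concrete, here is a rough C model of what idiv sees in EDX:EAX. The function names (`sign_fill`, `idiv_quotient`) are made up for illustration, and the 64-bit cast assumes the usual two's-complement conversion that GCC, Clang and MSVC implement. It does not model the #DE fault.

```c
#include <stdint.h>

/* What cdq leaves in EDX: all ones if EAX's sign bit is set, else zero. */
uint32_t sign_fill(uint32_t eax) {
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

/* Quotient that `idiv r32` would leave in EAX, given EDX:EAX and a divisor.
   Assumes divisor != 0 and that the quotient fits in 32 bits (no fault).
   The uint64_t -> int64_t cast assumes two's-complement conversion. */
int32_t idiv_quotient(uint32_t edx, uint32_t eax, int32_t divisor) {
    int64_t dividend = (int64_t)(((uint64_t)edx << 32) | eax);
    return (int32_t)(dividend / divisor);   /* C also truncates toward zero */
}
```

With `eax = (uint32_t)-5`, passing `sign_fill(eax)` as EDX gives -2, while passing 0 (the `xor edx,edx` case) gives 2147483645, because the dividend is then read as 4294967291.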

Using the "full power" of 64 / 32-bit => 32-bit division is possible, but not safe unless you know the divisor is large enough so the quotient doesn't overflow. (i.e. you can't in general implement (a*b) / c with just mul / div and a 64-bit temporary in EDX:EAX.)

Division raises an exception (#DE) on overflow of the quotient. On Unix/Linux, the kernel delivers SIGFPE for arithmetic exceptions including divide errors. With normal sign or zero-extended divide, overflow is only possible with idiv of INT_MIN / -1 (i.e. the 2's complement special case of the most negative number.)
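As a sketch of when the fault happens, these C predicates (hypothetical helper names, not anything from a library) express the conditions: unsigned div fits exactly when the high half is smaller than the divisor, and after cdq the only faulting cases are a zero divisor and INT_MIN / -1.

```c
#include <stdbool.h>
#include <stdint.h>

/* `div r32` faults if the divisor is 0 or the quotient needs more than
   32 bits. With dividend EDX:EAX, the quotient fits iff EDX < divisor. */
bool div_would_fault(uint32_t edx, uint32_t divisor) {
    return divisor == 0 || edx >= divisor;
}

/* (a*b)/c via mul then div: the product's high half lands in EDX,
   so the div can fault even though c != 0. */
bool muldiv_would_fault(uint32_t a, uint32_t b, uint32_t c) {
    uint64_t product = (uint64_t)a * b;
    return div_would_fault((uint32_t)(product >> 32), c);
}

/* cdq / idiv: EDX:EAX is just the sign-extended EAX, so the only
   overflow case is the most negative number divided by -1. */
bool cdq_idiv_would_fault(int32_t eax, int32_t divisor) {
    return divisor == 0 || (eax == INT32_MIN && divisor == -1);
}
```

For example, `muldiv_would_fault(0x10000, 0x10000, 1)` is true (the quotient would be 2^32), while dividing the same product by 2 is fine.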


As you can see from the insn ref manual (link in the tag wiki):

  • one-operand mul / imul: edx:eax = eax * src
  • two-operand imul: dst *= src. e.g. imul ecx, esi doesn't read or write eax or edx.

  • div / idiv: divides edx:eax by the src. quotient in eax, remainder in edx. There's no form of div / idiv that ignores edx in the input.
  • cdq sign-extends eax into edx:eax, i.e. broadcasts the sign bit of eax into every bit of edx. Not to be confused with cdqe, the 64-bit instruction that is a more compact form of movsxd rax, eax.

    Originally (8086), there was just cbw (ax = sign_extend(al)) and cwd (dx:ax = sign_extend(ax)). The extensions of x86 to 32bit and 64bit have made the mnemonics slightly ambiguous (but remember, other than cbw, the within-eax versions always end with an e for Extend). There is no dl=sign_bit(al) instruction because 8bit mul and div are special, and use ax instead of dl:al.
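If the mnemonics blur together, it may help to see the two families side by side as bit operations. This is only a C model with made-up function names: the "within one register" forms return the widened value, and the "into DX/EDX" forms return just the high half they write.

```c
#include <stdint.h>

/* Within-register forms: widen the value in place. */
uint16_t model_cbw(uint8_t al) {            /* ax  = sign_extend(al)  */
    return (al & 0x80u) ? (uint16_t)(0xFF00u | al) : al;
}
uint32_t model_cwde(uint16_t ax) {          /* eax = sign_extend(ax)  */
    return (ax & 0x8000u) ? (0xFFFF0000u | ax) : ax;
}
uint64_t model_cdqe(uint32_t eax) {         /* rax = sign_extend(eax) */
    return (eax & 0x80000000u) ? (0xFFFFFFFF00000000ull | eax) : eax;
}

/* Across-register forms: return the new high half (DX or EDX). */
uint16_t model_cwd_dx(uint16_t ax) {        /* dx:ax   = sign_extend(ax)  */
    return (ax & 0x8000u) ? 0xFFFFu : 0u;
}
uint32_t model_cdq_edx(uint32_t eax) {      /* edx:eax = sign_extend(eax) */
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}
```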


Since the inputs to [i]mul are single registers, you never need to do anything with edx before a multiply.

If your input is signed, you sign-extend it to fill the register you're using as an input to the multiply e.g. with movsx or cwde (eax = sign_extend(ax)). If your input is unsigned, you zero extend. (With the exception that if you only need the low 16 bits of the multiply result, for example, it doesn't matter if the upper 16 bits of either or both inputs contain garbage.)
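A small C sketch of that point, with illustrative helper names: the 16-bit pattern 0xFFFF multiplied by 2 gives different 32-bit results depending on how it was extended, but the low 16 bits agree no matter what is in the upper halves of the inputs.

```c
#include <stdint.h>

/* movzx-style zero extension of a 16-bit value to 32 bits. */
uint32_t zero_extend16(uint16_t x) {
    return x;
}

/* movsx / cwde-style sign extension of a 16-bit value to 32 bits. */
uint32_t sign_extend16(uint16_t x) {
    return (x & 0x8000u) ? (0xFFFF0000u | x) : x;
}

/* Low 32 bits of the product, like two-operand imul (wraps mod 2^32). */
uint32_t mul32(uint32_t a, uint32_t b) {
    return a * b;
}
```

`mul32(sign_extend16(0xFFFF), 2)` is 0xFFFFFFFE (that is, -1 * 2), `mul32(zero_extend16(0xFFFF), 2)` is 0x0001FFFE (65535 * 2), and truncating either one to 16 bits gives 0xFFFE.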


For a divide, you always need to zero or sign extend eax into edx. Zero-extending is the same as just unconditionally zeroing edx, so there's no special instruction for it. Just xor edx,edx.

cdq exists because it's a lot shorter than mov edx, eax / sar edx, 31 to broadcast the sign bit of eax to every bit in edx. Also, shifts with immediate count > 1 didn't exist until 186 and were still 1 cycle per count, so on 8086 you'd have to do something even worse (like branch, or rotate the sign bit to the bottom and isolate + neg it). So cwd in 8086 saved a lot of time/space when it was needed.


In 64-bit mode, sign- and zero-extending 32-bit values to 64-bit is common. The ABI allows garbage in the high 32 bits of a 64-bit register holding a 32-bit value, so if your function is only supposed to look at edi (the low 32 bits of rdi), you can't just use [array + rdi] to index the array.

So you see a lot of movsxd rdi, edi (sign-extend), or mov eax, edi (zero-extend, and yes, it's more efficient to use a different target register, because Intel mov-elimination doesn't work with mov same,same).
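Here is a minimal C model of the two fixes, treating rdi as a 64-bit value whose upper half may hold garbage. The function names are invented for the example, and the results are returned as raw 64-bit patterns.

```c
#include <stdint.h>

/* mov eax, edi: writing a 32-bit register clears the upper 32 bits,
   so the full register holds the zero-extended low half. */
uint64_t zero_extend_index(uint64_t rdi) {
    return (uint32_t)rdi;
}

/* movsxd rdi, edi: copy bit 31 of the low half into bits 32..63. */
uint64_t sign_extend_index(uint64_t rdi) {
    uint32_t low = (uint32_t)rdi;
    return (low & 0x80000000u) ? (0xFFFFFFFF00000000ull | low) : low;
}
```

With `rdi = 0xDEADBEEF00000003`, both give 3, which is safe to use in an address. With a low half of 0xFFFFFFFF the two differ: zero extension gives 4294967295, sign extension gives the 64-bit pattern for -1, so which one you want depends on whether the C type was unsigned or int.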

Wilford answered 7/4, 2016 at 1:18 Comment(4)
Sorry - I always get mixed up with division in assembly because I get confused with the registers. I thought that the dividend was always placed in eax/ax and the one-operand instruction was just div/idiv ebx (or whatever register), which would effectively perform eax / ebx with the quotient in eax and the remainder in edx. My exam showed us using cdq before we called idiv on EAX containing 71 and another register containing -4. Why is this? We were using the entirety of each register, so I don't understand why we needed one of them to be a quadword. - Frieda
@Koronakesh: Read the first line of my answer, and/or Intel's insn ref manual. idiv ebx does eax = (edx:eax)/ebx and edx = (edx:eax)%ebx. edx is always the high half of the dividend, and the explicit operand is always the divisor. There's no form of div / idiv that ignores edx the way the 2 and 3-operand forms of imul only produce a single-register result. - Wilford
Okay - this is making sense now. Are there requirements on the size of the dividend compared to the divisor? Also, do instructions like cdq exist simply because it's 1 byte less costly than something like sub edx, edx? - Frieda
@Koronakesh: cdq exists because it's a lot shorter than mov edx, eax / sar edx, 31 to broadcast the sign bit of eax to every bit in edx. xor edx,edx zero-extends, which is different from sign-extending. Also, shifts with an immediate count > 1 didn't exist until the 186, so on the original 8086 you'd have needed a count in cl or a sequence of single-bit shifts. As for size limits, yes, if you read the instruction reference manual, you'll see that div faults if the quotient overflows the operand-size (e.g. 32 bits). - Wilford
