What's the purpose of the rotate instructions (ROL, RCL on x86)?
C

6

32

I've always wondered what the purpose is of the rotate instructions some CPUs have (ROL, RCL on x86, for example). What kind of software makes use of these instructions?

I first thought they might be used for encryption/computing hash codes, but these libraries are usually written in C, which doesn't have operators that map to these instructions. (Editor's note: see Best practices for circular shift (rotate) operations in C++ for how to write C or C++ that will compile to a rotate instruction. Also, optimized crypto libraries often do have asm for specific platforms.)

Has anybody found a use for them? Why were they added to the instruction set?

Casandra answered 12/2, 2011 at 6:7 Comment(7)
Actually a good C compiler will emit rol opcodes when compiling code which tries to compute a rotation with the C operators (i.e. (x << 12) | (x >> 20)).Trilateral
@Brian: I wrote rol, I meant rol (well, could be ror). The rotation opcode.Trilateral
@Thomas My C is rusty, I was thinking the << and >> operators were shift and not rotate.Agler
@Brian: << and >> are shifts. But for a 32-bit value x, the whole expression (x << 12) | (x >> 20), consisting of two shifts (one left, one right) and a bitwise OR, has the same effect as a rotation of a 32-bit word (here, by 12 bits to the left). C compilers are smart enough to notice it, and compile the complete expression as a single rol.Trilateral
Some libraries have bit-rotate intrinsics, but I also think C should have had rotate operators from the start. That would make the code much easier to understand, and the compiler would have less work to do.Mahmoud
I've actually just coded an assembly routine where I needed to manipulate the Alpha channel in a TColor (32-bit RGBA with A in the high nibble). The easy way to do this was ROL EAX,8 / MOV AL,Value / ROR EAX,8 instead of AND EAX,$00FFFFFF / MOV DL,Value / SHL EDX,24 / OR EAX,EDX (assuming that Value isn't a constant but a value that may be different on each iteration). I ended up with BSWAP EAX / MOV AL,Value / BSWAP EAX instead, but that's neither here nor there :-).Misdemean
High BYTE - not nibble :-)Misdemean
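
As a rough illustration of the shift-and-OR idiom from the comments above, here is a minimal C sketch. The helper names (rotl32, rotr32, set_high_byte) are made up for this example and do not come from any library. Masking the count is an added precaution so that a count of 0 or 32 never produces an out-of-range shift. Whether a given compiler turns this into a single rol/ror depends on the compiler and optimization level; mainstream optimizing compilers commonly do.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits. The count is masked so both
   shifts stay in the range 0..31 and are always well defined. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Rotate a 32-bit value right by n bits. */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Replace the high byte of x, mirroring the comment's
   ROL EAX,8 / MOV AL,Value / ROR EAX,8 sequence. */
static inline uint32_t set_high_byte(uint32_t x, uint8_t value)
{
    x = rotl32(x, 8);                /* high byte is now the low byte */
    x = (x & 0xFFFFFF00u) | value;   /* overwrite the low byte        */
    return rotr32(x, 8);             /* rotate it back into place     */
}

/* Usage: the expression from the comments and the rotate agree. */
void rotate_idiom_example(void)
{
    uint32_t x = 0x12345678u;
    assert(rotl32(x, 12) == ((x << 12) | (x >> 20)));
    assert(set_high_byte(0x11223344u, 0xAB) == 0xAB223344u);
}
```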
D
30

Rotates are required for bit shifts across multiple words. When you SHL the lower word, the high-order bit spills out into the carry. To complete the operation, you need to shift the higher word(s) while bringing in the carry to the low-order bit. RCL is the instruction that accomplishes this.

                      High word             Low word         CF
Initial          0110 1001 1011 1001   1100 0010 0000 1101    ?
SHL low word     0110 1001 1011 1001   1000 0100 0001 1010    1
RCL high word    1101 0011 0111 0011   1000 0100 0001 1010    0
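
The table above can be reproduced with a small C sketch that emulates the carry flag by hand. The struct and function names (wide16x2, shl_wide) are illustrative assumptions, and the explicit cf variable stands in for what SHL and RCL do with the hardware flag.

```c
#include <assert.h>
#include <stdint.h>

/* Two 16-bit words treated as one 32-bit quantity (illustrative type). */
struct wide16x2 {
    uint16_t hi;
    uint16_t lo;
};

/* Shift the pair left by one bit and return the final carry.
   Step 1 mimics SHL on the low word, step 2 mimics RCL on the high word. */
static inline unsigned shl_wide(struct wide16x2 *w)
{
    unsigned cf = (w->lo >> 15) & 1u;         /* SHL: top bit of low word -> CF */
    w->lo = (uint16_t)(w->lo << 1);

    unsigned out = (w->hi >> 15) & 1u;        /* RCL: top bit of high word -> CF */
    w->hi = (uint16_t)((w->hi << 1) | cf);    /* old CF enters bit 0            */
    return out;
}

/* Usage: the values from the table above. */
void shl_wide_example(void)
{
    struct wide16x2 w = { 0x69B9, 0xC20D };
    unsigned cf = shl_wide(&w);
    assert(w.lo == 0x841A);
    assert(w.hi == 0xD373);
    assert(cf == 0);
}
```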

ROL and ROR are useful for examining a value bit-by-bit in a way that is (ultimately) non-destructive. They can also be used to shunt a bitmask around without bringing in garbage bits.
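
To illustrate the "ultimately non-destructive" examination, here is a hedged C sketch that walks every bit of a word by rotating it one position at a time. The function name count_bits_by_rotation is made up for this example; in assembly the bit of interest would be read from the carry flag, whereas here it is read from bit 0 after it wraps around.

```c
#include <assert.h>
#include <stdint.h>

/* Count the set bits of *x by rotating left one bit at a time.
   After 32 single-bit rotations *x holds its original value again. */
static inline unsigned count_bits_by_rotation(uint32_t *x)
{
    unsigned count = 0;
    for (int i = 0; i < 32; i++) {
        *x = (*x << 1) | (*x >> 31);   /* ROL by 1: the top bit wraps to bit 0 */
        count += *x & 1u;              /* inspect the bit that just wrapped    */
    }
    return count;
}

/* Usage: the value survives the full walk. */
void bit_walk_example(void)
{
    uint32_t v = 0xF0F01234u;
    unsigned n = count_bits_by_rotation(&v);
    assert(n == 13);
    assert(v == 0xF0F01234u);
}
```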

Druci answered 12/2, 2011 at 6:45 Comment(6)
When would use rotation to test bits instead of BT?Gutow
When you want to test them all and, perhaps, in order.Druci
Or alternatively, when you don't have BT to begin with.Druci
Rotates are only effective when shifting only 1 bitMahmoud
Wouldn't CF be 0 after the third step? (the bit that goes off is set to CF and previous value of CF is inserted to the right-most position)Sansculotte
"In assembly languages these instructions are represented by mnemonics such as ADD/SUB, ADC/SBC (ADD/SUB including carry), SHL/SHR (bit shifts), ROL/ROR (bit rotates), RCR/RCL (rotate through carry), and so on. [1] The use of the carry flag in this manner enables multi-word add, subtract, shift, and rotate operations." en.wikipedia.org/wiki/Carry_flagMahmoud
C
18

The rotate opcodes (ROL, RCL, ROR, RCR) are used almost exclusively for hashing and CRC computations. They are pretty arcane and very rarely used.

The shift opcodes (SHL, SHR) are used for fast multiplication by powers of 2, or to move a low byte into a high byte of a large register.

The difference between ROL and SHL is ROL takes the high bit and rolls it around into the low bit position. SHL throws the high bit away and fills the low bit position with zero.
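
The difference described above can be shown on a single byte. This is a minimal C sketch with made-up helper names (shl8, rol8), each operating by exactly one bit position.

```c
#include <assert.h>
#include <stdint.h>

/* Shift left by one: the high bit is discarded, bit 0 becomes zero. */
static inline uint8_t shl8(uint8_t x)
{
    return (uint8_t)(x << 1);
}

/* Rotate left by one: the high bit wraps around into bit 0. */
static inline uint8_t rol8(uint8_t x)
{
    return (uint8_t)((x << 1) | (x >> 7));
}

/* Usage: 1000 0001 becomes 0000 0010 when shifted, 0000 0011 when rotated. */
void shl_vs_rol_example(void)
{
    assert(shl8(0x81) == 0x02);
    assert(rol8(0x81) == 0x03);
}
```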

Corny answered 12/2, 2011 at 6:44 Comment(6)
I don't see how you answered the question.Gutow
maybe you can add the difference to ROL/RCL and ROR/RCR in your answer too.Marylandmarylee
Pretty arcane and rarely used? Really? There are many places where rotation is useful, especially in, as you say, hashing and cryptography. On many CPUs where the shift amount affects timing, it's actually faster to rotate and bitwise AND than to do a shift.Toggery
Yes, very rarely used. Hashing and crypto are things to be used from libraries, not something every developer should write for themselves.Corny
Note that from a CPU design perspective (which instructions to provide), the relevant measure is how frequently it is (or would be) executed, not how many different pieces of software will contain the instruction. It's not that hard to emulate (unlike some special-purpose instructions like popcnt or crc32 or SIMD psadbw which was added basically for video-encode motion-search), but OTOH it doesn't take much extra hardware to make a barrel shifter capable of rotating.Estes
ROR can be used for swaps during certain sorts, if the values are adjacent in memory. Works great.Mccormick
S
9

ROR and ROL are "historic" but still useful in a number of ways.

Before the 80386 (and opcode BT), ROL would be used a lot to test a bit (SHL doesn't propagate to the carry flag) - actually in 8088, ROR/ROL would only shift by 1 bit at a time !!!!

Also, if you want to shift one way and then the other way without losing the bits that have been shifted out of scope, you'd use ROR/ROL instead of SHR/SHL.
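
A small C sketch of that round trip, using hypothetical helpers rotl32 and rotr32 (not part of standard C): rotating left and then right restores the value, while shifting left and then right drops the bits that left the register.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by n bits; the count is masked to keep shifts in range. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Rotate right by n bits. */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Usage: the rotate round trip is lossless, the shift round trip is not. */
void round_trip_example(void)
{
    uint32_t x = 0xF000000Fu;
    assert(rotr32(rotl32(x, 4), 4) == x);      /* top nibble preserved */
    assert(((x << 4) >> 4) == 0x0000000Fu);    /* top nibble lost      */
}
```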

Serg answered 12/2, 2011 at 6:53 Comment(3)
And the 8080 didn't even have shift instructions -- rotate was all you got!Gutow
It is right that on the 8088 you can only rotate by one, if you use an immediate rotate count. However, the 8088 does support rotating by a count given in the register cl too. (Immediate byte shift/rotate counts other than 1 were added in the 186 instruction set.)Beautician
ROL sets CF the same way SHL does. (according to the last bit shifted left out of the high bit). The only difference is that ROL shifts the bit in the bottom instead of shifting in zeros. (And in SHL setting SF, ZF, and PF like a normal ALU instruction, unlike rotates.)Estes
A
4

If I understand you correctly, your question is this:

"Given the fact that rotation instructions seem to be very special-purpose and not emitted by compilers, when are they actually used and why are they included in CPUs?".

The answer is twofold:

  1. CPUs are not designed specifically to execute C programs. Rather, they are designed as general purpose machines, intended to solve a wide array of problems using a wide variety of different tools and languages.

  2. The designers of a language are under no obligation to use every opcode in the CPU. In fact, most of the time, they do not, because some CPU instructions are highly specialized, and the language designer has no pressing need to use them.

More information about bitwise operators (and how they relate to C programming) can be found here: http://en.wikipedia.org/wiki/Bitwise_operation

Allodium answered 12/2, 2011 at 6:34 Comment(0)
G
3

Back when microprocessors were first created, most programs were written in assembly, not compiled. The majority of CPU instructions are probably not emitted by compilers (which is the impetus for creating RISC), but are often relatively easy to implement in hardware.

Many algorithms in graphics and cryptography use rotation, and their inclusion in CPUs makes it possible to write very fast algorithms in assembly.

Gutow answered 12/2, 2011 at 6:46 Comment(0)
A
1

I think many answers here got it somewhat backwards, including the currently accepted one. The biggest application is in shifting data across byte/word boundaries, which is extensively used in

  • extracting and inserting bit patterns
    • protocols (insert 5 bits starting from bit 6)
    • compression schemes (LZ77, LZW and more)
    • data transfer (300 baud modems anyone? 7-bit data + parity)
  • arbitrary precision arithmetic
    • multiplying/dividing by 2 utilises rotations-through-carry
    • multiplying/dividing by other powers of two needs ROL (or ROR)
    • scrolling 1-bit graphics horizontally

And the niche applications:

  • crc16/32
  • ciphers
  • non-destructive moving bits to sign bit or to carry for testing

The historical perspective is that shifting was expensive: when one needs to shift say 16 bits left by 3, in chunks of 8 bits (or 128 bits left in chunks of 64 bits), a ROL performs two expensive shifts at the cost of one:

rotate all bits left by 3
      hi       lo
src = fedcba98|76543210
dst = cba98765|43210---

Notice that the bits "765" need to be shifted right by 5, while the bits "43210" need to be shifted left by 3. A single rotation accomplishes both: it puts all the wanted bits in their correct positions, albeit alongside unwanted bits, and the results are then recombined by masking, which is an inexpensive operation:

dst_lo = ((src_lo ROL 3) & 0b11111000)
dst_hi = ((src_lo ROL 3) & 0b00000111) | (src_hi << 3)

This extends to bignum shifting, or scrolling a monochrome graphics plane horizontally by arbitrary number of pixels.
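
The two pseudocode lines above can be written out as a short C sketch. The function names (rol8, shl16_by3) are assumptions made for this example, and the shift count is fixed at 3 to match the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate an 8-bit value left by n bits, for n in 1..7. */
static inline uint8_t rol8(uint8_t x, unsigned n)
{
    return (uint8_t)((x << n) | (x >> (8 - n)));
}

/* Shift the 16-bit value hi:lo left by 3 in 8-bit chunks,
   using one rotate of the low byte plus two masks. */
static inline void shl16_by3(uint8_t *hi, uint8_t *lo)
{
    uint8_t r = rol8(*lo, 3);                   /* "43210765"                  */
    *hi = (uint8_t)((r & 0x07) | (*hi << 3));   /* "765" joins "cba98" << 3    */
    *lo = (uint8_t)(r & 0xF8);                  /* "43210" with zeros below    */
}

/* Usage: 0xABCD shifted left by 3, truncated to 16 bits, is 0x5E68. */
void shl16_by3_example(void)
{
    uint8_t hi = 0xAB, lo = 0xCD;
    shl16_by3(&hi, &lo);
    assert(hi == 0x5E);
    assert(lo == 0x68);
}
```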

This technique is so essential that the 80386 added double-precision shift instructions (SHLD and SHRD) for it.

Adelleadelpho answered 11/1, 2022 at 17:25 Comment(2)
Many CPUs have a rotate-through-carry flag which you could use for equivalent purposes, if you're limited to shifting 1 bit at a time. That also enabled variable-count shifts across register boundaries using a loop, which wouldn't be possible with ROL without another shift (and NOT) to create those masks. Still, yes, interesting point for constant shift-counts on machines which have multi-bit shifts that are faster than looping but still slow. (Like 8086). However, you'd optimize to src_hi << 3 instead of ROL + mask, since the bits shifted out there aren't shifted into anything.Estes
Yes, it's definitely worth it to have the destructive variants (arithmetic and logical shifting right / logical shift left) for those cases that need it; And indeed the last word benefits from those instructions. I suppose I wanted to extend the concept to really-multi-word shifting before editing.Adelleadelpho
