What is the general difference between superscalar and out-of-order (OoO) execution?

Superscalar microprocessors can execute two or more instructions at the same time. Typically they have at least two ALUs, although a superscalar processor might instead have one ALU and some other execution unit, such as a shifter or jump unit.

More precisely, superscalar processors can start executing two or more instructions in the same cycle. Pipelined processors can also have more than one instruction executing at a time, but a non-superscalar pipelined processor starts only a single instruction in any given cycle; its pipelined execution units simply take multiple cycles to execute an instruction end to end. Put another way: a superscalar processor can usually execute two single-cycle-latency (non-pipelined) instructions per cycle, whereas a non-superscalar pipelined processor cannot have two single-cycle instructions executing in its ALUs at the same time.
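
The distinction between "how many instructions start per cycle" and "how many are executing at once" can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the instructions are all independent, and the names `issue_cycles` and `in_flight` are made up for illustration, not taken from any real tool.

```python
def issue_cycles(n_instructions, issue_width):
    """Start cycle of each of n independent instructions when at most
    issue_width instructions can start per cycle (toy model)."""
    return [k // issue_width for k in range(n_instructions)]


def in_flight(starts, latency, cycle):
    """How many instructions are executing during the given cycle."""
    return sum(1 for s in starts if s <= cycle < s + latency)


scalar = issue_cycles(4, issue_width=1)        # [0, 1, 2, 3]
superscalar = issue_cycles(4, issue_width=2)   # [0, 0, 1, 1]

# Pipelined scalar, 3-cycle operations: several overlap, but one starts per cycle.
print(in_flight(scalar, latency=3, cycle=2))       # 3
# Scalar, single-cycle operations: never more than one executing.
print(in_flight(scalar, latency=1, cycle=2))       # 1
# 2-wide superscalar, single-cycle operations: two executing in the same cycle.
print(in_flight(superscalar, latency=1, cycle=0))  # 2
```

In this model, pipelining shows up as overlap between multi-cycle operations, while superscalar width shows up as more than one entry sharing the same start cycle.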

Out-of-order processors can execute instructions out of the original order. For example, in the following, where MULTIPLY takes 5 cycles, instruction 3 may execute before instruction 2 - because instruction 2 is waiting for the 5 cycle result of the MULTIPLY of instruction 1:

1: MULTIPLY reg1 := reg2 * reg3
2: ADD reg4 := reg1 + 5
3: ADD reg6 := reg2 + 1

Most out-of-order processors are also superscalar. However, you can imagine building an out-of-order processor that is not superscalar, one that can only initiate one operation on a pipelined ALU per cycle. (I proposed such designs as low-power chips when I was employed by Intel. Heck, you can build out-of-order processors that are only half-way scalar, e.g. that have only a 16-bit-wide ALU and take 2 cycles for a 32-bit add, etc. But that's stretching.)

Many superscalar processors, however, are not out-of-order. In the example above, an in-order superscalar would execute instruction 1 first. It would not start instruction 3, but would wait until instruction 2 could start - at which time it would start instructions 2 and 3 together.
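
To make the four combinations concrete, here is a rough Python sketch that schedules the three-instruction example on a machine with a given issue width, either in-order or out-of-order. It is a toy model, not a description of any real processor: the tuple encoding of the instructions, the function names, and the 1-cycle ADD latency are my assumptions, and it tracks only true (read-after-write) dependences, as if registers were renamed.

```python
# Hypothetical encoding of the example: (text, destination, sources, latency in cycles).
PROGRAM = [
    ("MULTIPLY reg1 := reg2 * reg3", "reg1", ["reg2", "reg3"], 5),
    ("ADD reg4 := reg1 + 5", "reg4", ["reg1"], 1),  # ADD latency of 1 is assumed
    ("ADD reg6 := reg2 + 1", "reg6", ["reg2"], 1),
]


def producers(program):
    """For each instruction, the indices of older instructions whose results it reads."""
    last_writer = {}
    deps = []
    for i, (_, dest, srcs, _) in enumerate(program):
        deps.append([last_writer[s] for s in srcs if s in last_writer])
        last_writer[dest] = i
    return deps


def start_cycles(program, width, in_order):
    """Return the cycle in which each instruction starts, listed in program order."""
    deps = producers(program)
    done_at = {}                      # instruction index -> cycle its result is available
    start = [None] * len(program)
    pending = list(range(len(program)))
    cycle = 0
    while pending:
        started = 0
        for i in list(pending):       # oldest first
            if started == width:
                break                 # issue width used up for this cycle
            if all(j in done_at and done_at[j] <= cycle for j in deps[i]):
                start[i] = cycle
                done_at[i] = cycle + program[i][3]
                pending.remove(i)
                started += 1
            elif in_order:
                break                 # in-order: nothing younger may pass a stalled instruction
        cycle += 1
    return start


print(start_cycles(PROGRAM, width=1, in_order=True))   # [0, 5, 6]
print(start_cycles(PROGRAM, width=2, in_order=True))   # [0, 5, 5]
print(start_cycles(PROGRAM, width=1, in_order=False))  # [0, 5, 1]
print(start_cycles(PROGRAM, width=2, in_order=False))  # [0, 5, 0]
```

The second line is the in-order superscalar case described above: instructions 2 and 3 start together at cycle 5. The third line is the 1-wide out-of-order case: instruction 3 slips ahead of instruction 2 even though only one instruction starts per cycle.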

Sometimes you have to think about unlikely limit cases, such as 1-wide or half-wide OoO machines, to understand the concepts.
