Allowing the compiler to re-order the evaluation of the operands adds more room for optimization.
Here's a completely made-up example to illustrate.
Suppose the processor can:
- Issue 1 instruction each cycle.
- Execute an addition in 1 cycle.
- Execute a multiplication in 3 cycles.
- Execute additions and multiplications at the same time.
Now suppose you have a function call as follows:
foo(a += 1, b += 2, c += 3, d *= 10);
If you were to execute this left-to-right on a processor without out-of-order execution (OOE):
Cycle - Operation
0 - a += 1
1 - b += 2
2 - c += 3
3 - d *= 10
4 - d *= 10
5 - d *= 10
Now if you allow the compiler to re-order them and start the multiplication first:
Cycle - Operation
0 - d *= 10
1 - a += 1, d *= 10
2 - b += 2, d *= 10
3 - c += 3
So that's 6 cycles left-to-right vs. 4 cycles re-ordered.
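The two cycle counts can be reproduced with a small sketch of the toy model. It assumes the four operations are independent of each other, that exactly one instruction issues per cycle in the given order, and that an instruction issued in cycle i with latency L is done after cycle i + L - 1. The names `total_cycles`, `kAdd`, `kMul`, `left_to_right` and `multiplication_first` are made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model only (not a real processor): one instruction issues per cycle,
// in the order given, and additions and multiplications may overlap.
// An instruction issued in cycle i with latency L occupies cycles i .. i+L-1,
// so the total is the largest (i + L) over all instructions.
int total_cycles(const std::vector<int>& latencies) {
    int total = 0;
    for (std::size_t i = 0; i < latencies.size(); ++i) {
        total = std::max(total, static_cast<int>(i) + latencies[i]);
    }
    return total;
}

constexpr int kAdd = 1;  // assumed latency of an addition
constexpr int kMul = 3;  // assumed latency of a multiplication

// Issue order: a += 1, b += 2, c += 3, d *= 10
int left_to_right() { return total_cycles({kAdd, kAdd, kAdd, kMul}); }

// Issue order: d *= 10, a += 1, b += 2, c += 3
int multiplication_first() { return total_cycles({kMul, kAdd, kAdd, kAdd}); }
```

Under these assumptions `left_to_right()` gives 6 and `multiplication_first()` gives 4, matching the two tables.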
Again, this is completely contrived. Modern processors are much more complicated than that. But you get the idea.