Are PUSH/POP instructions considered RISC or CISC?

Asked 8/12, 2016 at 2:55 Answered 4/1, 2021 at 10:36

assembly cpu-architecture instruction-set risc

I was asked in an interview if PUSH and POP are RISC or CISC instructions. I said that they were RISC, but they told me that they were actually CISC instructions. I suggested that ARM (a common RISC implementation) has these instructions, but they pointed out that ARM is mixed and not purely RISC anymore.

I can't find any definitive proof one way or the other online. Are PUSH and POP instructions really considered a hallmark of a CISC architecture, or would they be found on a RISC system? Why?

Inextinguishable asked 8/12, 2016 at 2:55 Comment(5)

I did and they say Arm is mixed and it is not purely RISC anymore – Inextinguishable 8/12, 2016 at 2:59

ARM has always supported performing a push or pop with a single instruction, but not through a specialized PUSH/POP instruction. It was done through automatic post-increment and pre-decrement addressing modes. – Mumford 8/12, 2016 at 4:0

On x86/x86-64, push and pop are heavily optimized. So optimized that they special case the dependency on the stack pointer for consecutive pushes - the second push won't wait until the stack pointer is updated by the first push. Push is simply a store to a (usually) pre-decremented stack pointer register. Is a store with a certain addressing mode specifically CISC or RISC? It's arguable. – Moynihan 8/12, 2016 at 4:21

@doug65536: a clearer explanation of the "stack engine" that lets push/pop be efficient by tracking an offset from the value of RSP that's live in the out-of-order core: #36632076. Fortunately AMD and Intel both have very similar stack engines, thanks to their patent-sharing agreement. – Thomajan 8/12, 2016 at 4:25

@Jonas This question isn't about a specific architecture, and the risc tag actually does apply. (You could argue that it's off-topic for stack overflow, since it's about CPU-architecture taxonomy (i.e. philosophical debates about naming), not programming...) That said, don't go removing it from questions where it does apply until / unless its burnination been agreed on meta. – Thomajan 4/1, 2021 at 10:16

RISC means "reduced instruction set computer": a small set of simple instructions (typically LOAD register, STORE register, ADD register, CMP register, conditional branch, plus a few others).

The notion, and experience, is that complex instructions often do not achieve a useful effect that cannot be achieved by simpler instruction sequences, especially if the extra logic that would be used to implement such complex instructions is instead invested in making the simple RISC instructions run faster.

PUSH and POP are basically simple combinations of an indirect STORE/LOAD and an ADD of a constant to a register. So if one dedicates a register as the stack pointer, PUSH and POP are easily simulated, and a fast pipelined machine can probably execute PUSH and POP about as fast as the corresponding RISC instructions. So most consider PUSH and POP to be CISC instructions; they don't really buy you a lot.
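To make the decomposition concrete, here is a minimal sketch of a toy load/store machine in Python. The `Machine` class, its register names, and the word-addressed, downward-growing stack are assumptions made for illustration, not a model of any real ISA. It only shows that `push` and `pop` can be written as two primitive operations each.

```python
class Machine:
    """Hypothetical toy machine: word-addressed memory, stack grows downward."""

    def __init__(self, stack_top=16):
        self.mem = [0] * stack_top
        self.reg = {"r0": 0, "r1": 0, "sp": stack_top}

    # Primitive "RISC-style" operations.
    def add_imm(self, rd, imm):
        """ADD a constant to a register."""
        self.reg[rd] += imm

    def store(self, rs, base):
        """STORE indirect: mem[base] = rs."""
        self.mem[self.reg[base]] = self.reg[rs]

    def load(self, rd, base):
        """LOAD indirect: rd = mem[base]."""
        self.reg[rd] = self.mem[self.reg[base]]

    # PUSH and POP expressed purely in terms of the primitives above.
    def push(self, rs):
        self.add_imm("sp", -1)   # pre-decrement the stack pointer
        self.store(rs, "sp")     # then store at the new top of stack

    def pop(self, rd):
        self.load(rd, "sp")      # load from the top of stack
        self.add_imm("sp", 1)    # then post-increment the stack pointer
```

For example, setting `r0` to 42, calling `push("r0")` and then `pop("r1")` leaves 42 in `r1` and restores `sp` to its starting value.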

Life gets more interesting if you consider CALL (== PUSH PC + JMP) and RET (POP PC). These are also easy to simulate on the right RISC architecture. However, POP PC incurs a pipeline bubble because the processor has a hard time predicting where the new PC will be, and so can't do a prefetch. With memory being "far away in time", this can be a major performance inhibitor in code with lots of subroutine calls.

Here, one sort of wants to go CISC. What you really want is some way to predict that return PC. Many modern CPUs do this by keeping a "shadow call stack" in the hardware. Each CALL pushes the PC into the memory stack, and also onto the shadow stack; each RET pops a PC value from the memory stack, but predicts instruction stream flow using the top entry of the shadow stack which it has essentially zero-time access to (and of course, pops the shadow stack). This way the instruction stream doesn't get interrupted, and the CISC machine thus wins on performance.
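The shadow-stack idea can be sketched as a small simulation. This is an illustration under stated assumptions, not a description of any particular CPU: the class name `ReturnPredictor`, the fixed `depth`, and the policy of discarding the oldest entry on overflow are all hypothetical choices.

```python
class ReturnPredictor:
    """Hypothetical model of a hardware shadow call stack used for RET prediction."""

    def __init__(self, depth=16):
        self.depth = depth
        self.shadow = []        # small, fast, on-chip copy of return addresses
        self.memory_stack = []  # the architectural stack in memory
        self.hits = 0
        self.misses = 0

    def call(self, return_pc):
        """CALL: push the return PC to memory and to the shadow stack."""
        self.memory_stack.append(return_pc)
        if len(self.shadow) == self.depth:
            self.shadow.pop(0)  # shadow stack is full: the oldest entry is lost
        self.shadow.append(return_pc)

    def ret(self):
        """RET: predict from the shadow stack, then check against memory."""
        predicted = self.shadow.pop() if self.shadow else None
        actual = self.memory_stack.pop()
        if predicted == actual:
            self.hits += 1      # fetch could continue without a bubble
        else:
            self.misses += 1    # prediction unavailable or wrong
        return actual
```

With `depth=2` and three nested calls, the two innermost returns are predicted correctly and the outermost one misses, because its entry was discarded when the shadow stack overflowed.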

(One wonders if a RISC machine with lots of registers, that compiled leaf function calls to always use a register to store the return PC, might not be as effective as a shadow stack. Sun SPARC sort of does this with its register windows.)

What this tells us is that RISC vs. CISC oversimplifies the design tradeoff. What you want is simple, unless more complexity actually buys you something. For example, IEEE floating-point in hardware is much faster than any simulation using RISC instructions.

As a consequence, most modern machines are not neatly RISC or CISC. Performance profiling chooses.

Nisse answered 8/12, 2016 at 3:6 Comment(1)

ARM is a good example of these tradeoffs. Mostly RISC (load/store machine) but deviating from RISC dogma when it's worth it for code-size: push/pop instructions with a bitfield of which registers. (In ARM mode, not thumb, store-multiple / load-multiple can use any register as the base.) And some not-so-simple addressing modes. ARM is a lot less RISCy than MIPS, for example. – Thomajan 1/2, 2020 at 3:47
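As an illustration of why a register-list push is less "simple" than a single store, the following sketch expands a register bitmask into the individual stores it implies. The function name `expand_push` is hypothetical. The layout assumed here (lowest-numbered register at the lowest address, stack pointer decremented by the total size first) follows the usual description of ARM's `push {reglist}`, but treat the details as an assumption rather than a reference.

```python
def expand_push(reg_mask, sp, word_size=4):
    """Expand a 16-bit register mask into (register, address) stores.

    Returns the new stack pointer and the list of stores, lowest-numbered
    register at the lowest address.
    """
    regs = [n for n in range(16) if reg_mask & (1 << n)]
    new_sp = sp - word_size * len(regs)
    stores = [(f"r{n}", new_sp + word_size * i) for i, n in enumerate(regs)]
    return new_sp, stores
```

The number of stores depends on how many bits are set in the mask, so the work done by one such instruction varies from one register to sixteen.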

"RISC" means different things to different people. Definitions I've seen include:

a reduced number of instructions
fixed size instructions (possibly lots of them)
a load/store architecture (possibly with lots of variable sized instructions)
a CPU that doesn't translate instructions into micro-ops and doesn't have "micro-code" (e.g. 8086, 8088, 80186, ..., but not 80486, Pentium, ...)
any combination of 2 or more of the things above
anything that sacrifices performance for the sake of reducing development costs

Are PUSH/POP instructions considered RISC or CISC?

Yes (PUSH/POP instructions are considered RISC, or CISC, or both, or neither; depending on which definition of RISC or CISC is being applied when).

Mostly the question itself is a false dichotomy; like asking "is grey black or white?".

Toratorah answered 4/1, 2021 at 10:36 Comment(5)

I think the most important aspect of a RISC that's actually common across modern real-world RISCs (some of which don't care much for ivory-tower RISC purity) is the max complexity of a single instruction: they can be done by a single (pipelined) execution unit, like an FP multiply or integer add, but not something like x86's rep movsb or ARM's push {r0, r1, r4, lr} (with a bitmap of which regs to push, making it near-impossible to run in constant time. ARM is not as RISCy, and that's one of its exceptions to standard RISC purity; related.) – Thomajan 4/1, 2021 at 23:44

I've seen this referred to as Reduced Insn Set Complexity, which is a more useful expansion of the acronym IMO. It implies being a load-store ISA, and definitely not having any memory-destination RMW instructions, because that would mean multiple different execution units. – Thomajan 4/1, 2021 at 23:46

@PeterCordes: To be honest; I think "RISC" was based on the idea that a simple CPU can run at higher frequencies and get the same amount of work done as a complex CPU (at lower frequencies); and that (at the end of 1990s) everyone found that increasing clock frequency is a massive disaster and started "creatively redefining" what "RISC" means in an attempt to salvage a decade of marketing hype; resulting in "RISC" CPUs that are significantly more complex than CISC originally was (as "increasing parallelism" replaced "increasing clock frequency" as a means of performance improvement). – Toratorah 5/1, 2021 at 0:34

Yeah that's fair, early RISC philosophy did often favour keeping the HW simpler even if that hurt per-clock performance. I guess my definition boils down to "pipelines well, especially superscalar/OoO", and or because the way early RISC machines achieved RISC-ness had that benefit which allowed high clocks (and pipe width even before the power wall, e.g. R10k). It's not that much of a stretch: the same things that "pipeline well" in a scalar in-order pipeline are still good, including fixed-width instructions for easy fetch/decode then allowing efficient parallel decode now. – Thomajan 5/1, 2021 at 0:44

It's only in the finer details where the philosophy really shifted. Back then, yeah keep HW simple so it can clock higher, rather than just still pipeline well enough to get more work done with fewer ops in the pipeline. e.g. Andy Glew commenting on lack of HW support for TLB consistency within the uops of one instruction making memory-destination adc cost extra uops: He said "I was a RISC proponent when I joined P6, and my attitude was 'let SW (microcode) do it'", as part of a comment thread (quoted here) about regretting that choice. – Thomajan 5/1, 2021 at 0:49
