What exactly is a dual-issue processor?

2

38

I came across several references to the concept of a dual-issue processor (I hope this even makes sense in a sentence). I can't find any explanation of what exactly dual issue is. Google gives me links to microcontroller specifications, but the concept isn't explained anywhere. Here's an example of such a reference. Am I looking in the wrong place? A brief paragraph on what it is would be very helpful.

Kriss answered 4/11, 2011 at 19:28 Comment(2)
I think the link you provided talks about dual-issue instructions rather than dual-issue processors. It covers the restrictions that would cause a second pipeline to be required in the next cycle. infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0363e/… is a link to another article, from the R-series processor documentation, that talks about (potentially different?) dual-issue instructions.Felicity
electronics.stackexchange.com/questions/145473/…Kelson
57

Dual issue means that each clock cycle the processor can move two instructions from one stage of the pipeline to the next stage. Which stage boundary this refers to depends on the processor and the company's terminology: it can mean that two instructions are moved from a decode queue to a reordering queue (Intel calls this issue), or it can mean moving instructions (or micro-operations or something) from a reordering queue to an execution port (afaik IBM calls this issue, while Intel calls it dispatch).

But really broadly speaking, it should usually mean you can sustain executing two instructions per cycle.

Since you tagged this ARM, I think they're using Intel's terminology. Cortex-A8 and Cortex-A9 can, each cycle, fetch two instructions (more in Thumb-2), decode two instructions, and "issue" two instructions. On Cortex-A8 there's no out-of-order execution, although I can't remember if there's still a decode queue that you issue to; if not, you'd go straight from decoding instructions to inserting them into two execution pipelines. On Cortex-A9 there's an issue queue, so the decoded instructions are issued there, and then the instructions are dispatched at up to 4 per cycle to the execution pipelines.
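
To make the "two instructions per cycle" idea concrete, here is a minimal Python sketch of an in-order issue stage. It is a toy model, not a description of how any Cortex core works: the `Instr` tuple, the `cycles_to_issue` function, the register names and the single-cycle-latency assumption are all made up for illustration. It issues up to `width` instructions per cycle in program order and ends the cycle's group as soon as an instruction depends on a register written earlier in the same group.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    dst: str      # destination register
    srcs: tuple   # source registers

def cycles_to_issue(program, width):
    """Count cycles for a toy in-order core issuing up to `width` instructions
    per cycle. Assumption: every result is usable the cycle after issue."""
    cycles = 0
    i = 0
    while i < len(program):
        written = set()   # registers written by this cycle's issue group
        issued = 0
        while i < len(program) and issued < width:
            ins = program[i]
            # Hazard against an earlier instruction in the same group:
            # in-order issue stops here and retries next cycle.
            if written & set(ins.srcs) or ins.dst in written:
                break
            written.add(ins.dst)
            issued += 1
            i += 1
        cycles += 1
    return cycles

# Four independent instructions vs. a chain where each one needs the previous result.
independent = [Instr("r0", ("r4", "r5")), Instr("r1", ("r6", "r7")),
               Instr("r2", ("r4", "r6")), Instr("r3", ("r5", "r7"))]
chain = [Instr("r0", ("r4", "r5")), Instr("r1", ("r0", "r5")),
         Instr("r2", ("r1", "r5")), Instr("r3", ("r2", "r5"))]

print(cycles_to_issue(independent, 1))  # 4
print(cycles_to_issue(independent, 2))  # 2
print(cycles_to_issue(chain, 2))        # 4
```

The last line is the point: dual issue raises the peak rate to two per cycle, but code with no independent neighbouring instructions gets no benefit from it in this model.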

Schleswig answered 4/11, 2011 at 20:40 Comment(1)
Of course ARM and Intel got it all wrong. The terms dispatch and issue go back all the way to the 60s and the CDC 6600.Gabrielson
0

In simple terms, dual issue means that the peak instruction execution rate of a CPU core is two instructions per clock cycle.

This can be achieved with some form of hardware instruction-level parallelism (as explained with some nice examples in the other answer). Superscalar architectures are the typical way to achieve instruction-level parallelism, by duplicating execution units.
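
As a rough illustration of what duplicating execution units buys you, here is a small Python sketch. It assumes a hypothetical in-order core with two pipes, both with an integer ALU but only one with a multiplier; the `PIPES` layout and the `issue_cycles` function are invented for this example and don't describe any particular chip. Data dependencies are ignored, so only execution-unit availability limits issue.

```python
# Hypothetical pipe layout: the operation kinds each pipe can execute.
PIPES = [{"alu", "mul"}, {"alu"}]

def issue_cycles(ops):
    """Count cycles to issue `ops` (a list of 'alu'/'mul' strings) in order.
    Only execution-unit availability is modelled, not data dependencies."""
    for op in ops:
        if not any(op in pipe for pipe in PIPES):
            raise ValueError(f"no pipe can execute {op!r}")
    cycles = 0
    i = 0
    while i < len(ops):
        free = list(PIPES)   # every pipe is free at the start of a cycle
        while i < len(ops):
            # Prefer the least capable free pipe, so an ALU op does not
            # occupy the only multiplier unnecessarily.
            candidates = sorted((p for p in free if ops[i] in p), key=len)
            if not candidates:
                break        # structural hazard: wait for the next cycle
            free.remove(candidates[0])
            i += 1
        cycles += 1
    return cycles

print(issue_cycles(["alu", "mul"]))  # 1
print(issue_cycles(["alu", "alu"]))  # 1
print(issue_cycles(["mul", "mul"]))  # 2
```

With this layout an add and a multiply can go in the same cycle, but two multiplies cannot, which is why "dual issue" describes a peak rate rather than a guarantee.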

Methodism answered 31/10, 2023 at 15:7 Comment(10)
"executing multiple instructions in the same execution unit" - Are you talking about Pentium 4, which runs its integer ALUs at double the clock speed, so each one can handle 2 uops per clock cycle, even ones dependent on each other? (So it's an IPC boost that doesn't require ILP.)Amaryl
Other than that, a normal execution unit can only start working on one instruction per clock cycle. A dual-issue CPU like P5 Pentium or Cortex-A53 might not duplicate everything, e.g. only one of the pipes might have an integer multiply unit, so it can issue an add and a multiply in the same cycle, but not two multiplies. (chipsandcheese.com/2023/05/28/…). A pipelined execution unit can have multiple instructions in flight, but each one started in a different cycle.Amaryl
@PeterCordes I was thinking more of newer RISC-V architectures. See one description of a dual issue execution unit here for example: scs.stanford.edu/~zyedidia/docs/sifive/sifive-u74mc.pdf (page 34). Quote "The S7 execution unit is a dual-issue, in-order pipeline"Methodism
Look at the context. They're (strangely) using the term "execution unit" to talk about a whole physical core. Or the parts of the core other than the cache / tightly-integrated-memory. It is a superscalar in-order core very much like P5 Pentium or A53, able to execute two independent instructions in the same clock, subject to a list of requirements (like that at most one accesses memory). It could run add x0, x1, x1 in the same cycle as add x2, x2, x1 so it needs two integer execution units (ALUs).Amaryl
lighterra.com/papers/modernmicroprocessors is very good, covering pipelined and superscalar, but doesn't go into detail about defining the term "execution unit". Anyway, that SiFive manual is the only time I've ever seen "execution unit" used to describe a whole superscalar CPU pipeline. Everyone else agrees that it's just one part of a pipeline that does the actual execution.Amaryl
@PeterCordes your Cortex-A53 example is very similar to the RISC-V example I pointed to. However I don't see any mention of superscalar in the official ARM Cortex-A53 TRM. Same as my RISC-V example, ARM only mentions dual-issue. That is why I believe it is a way to highlight that not everything is duplicated, as I explained in my original answer. While you might disagree with the mentioned RISC-V spec, would you at least agree that my answer reflects their terminology? (That is how I ended up looking at this StackOverflow question in the first place.)Methodism
"However I don't see any mention of superscalar in the official ARM Cortex-A53 TRM." I don't see how that's relevant. Dual-issue is a more specific term than superscalar, so there's no need for an ARM manual about a specific core to use the broader term. All cores capable of executing more than 1 instruction per clock cycle are superscalar; that's what that term means in the taxonomy of computer architectures.Amaryl
No, I wouldn't agree that your answer reflects the manual for that one RISC-V implementation. (Not the RISC-V spec, where an execution context is called a "hart", or hardware thread, and it doesn't say anything about the internals of a core.)Amaryl
Anyway, your answer starts out using normal terminology where a superscalar CPU has multiple "execution units", including more than one of some types. (e.g. realworldtech.com/sandy-bridge/6 has a diagram of the execution units on various ports in SnB). But then you bring up this alternate terminology where "execution unit" is the whole core other than the front-end. And you write as if that was a different hardware design, rather than just different terminology to describe the same thing. And as if that was some kind of alternative to superscalar CPUs.Amaryl
@PeterCordes Thanks for the feedback! I agree, I removed the second part.Methodism
