Difference between hyper-threading and multithreading?
I was wondering if someone could explain the difference between these two. Does it have something to do with Intel's hardware architecture (Hyper-Threading, HT)?

Consonance answered 3/1, 2013 at 16:49 Comment(1)
possible duplicate of multi-CPU, multi-core and hyper-thread – Gormley
Hyperthreading is a hardware feature and an Intel brand name. Most other people call it Simultaneous Multithreading (SMT). To the programmer, two hyperthreads look like two CPU cores. On the hardware side, multiple hyperthreads share a single core. (In the case of Intel, there are two hyperthreads per core.)

Multithreading (or multithreaded programming) is generally considered the concept of using more than one thread context (instruction pointer, registers, stack, etc.) in a single program. (Usually in the same process or virtual address space).

Bowe answered 7/1, 2013 at 6:14 Comment(2)
Technically "Hyper-Threading Technology" was applied to Switch-on-Event Multithreading in Itanium. Fine-grained multithreading (where instructions from only one thread begin execution each cycle) is another type of hardware multithreading (so far Intel has not used the Hyper-Threading Technology name for this). – Avrom
That's just false. HT on x86 is SMT. Go check out the CPUID instruction in x86. You can find it in "Intel® 64 and IA-32 Architectures Developer's Manual: Vol. 2A". – Bowe
A physical processor (PP) is the hardware implementation of a single processing unit. From this perspective, a "core" is the basic PP. Sometimes, terms such as multi-processor and multi-core are used to differentiate how processing units are organized in chips, and which other physical resources are shared among them, like L2 caches, buses, etc. But for this answer, we are interested in the most basic processing unit.

When a PP supports hyperthreading (let's just use this term for now), the PP is split into two or more logical processors (LPs). This is done by beefing up the execution pipeline and duplicating PP resources like the register set, the PC, the interrupt handling mechanism, and others. This allows the PP to hold and execute several execution contexts at the "same time". These execution contexts are sometimes called hardware threads (HTs). If the PP does not support hyperthreading (or it's turned off), the LP is the same as the PP.

A software thread (ST) is an execution context created by software, for instance with pthread_create() or clone(). These entities are scheduled by the operating system onto processors. A multithreaded program is a program in which the programmer explicitly creates STs. A multithreaded program can run on a processor that does not support hyperthreading. In this case, context switching among STs is expensive, because it requires the intervention of the scheduler and the use of memory to store and load execution contexts.

When hyperthreading is on, the OS schedules several STs onto one PP, usually one ST per LP. The OS sees LPs as if they were real PPs, so each ST runs on a different LP. Once STs have been scheduled, we can say they become hardware threads (HTs), loosely speaking, in the sense that the PP takes control. When one HT stalls, for instance on a cache miss or a pipeline flush, the PP executes another HT. This "context switch" costs almost nothing since the HT's context is already in the PP, and the OS is not involved in it. Most relevantly, these stalls and the corresponding context switches can happen at many stages of the pipeline. This is different from scheduler-based context switching, which happens on interrupt-based events, such as quantum expiration, I/O interrupts, aborts, and system calls.

As Nathan says in the previous answer, hyperthreading is a very specific term. A more general and vendor-agnostic term is "Simultaneous Multithreading" (SMT).

Finally, I strongly recommend reading:

1) Operating system support for simultaneous multithreaded processors. James R. Bulpin

2) Microarchitecture choices and tradeoffs for maximizing processing efficiency. Deborah T. Marr (Ph.D. dissertation)

Leilaleilah answered 29/9, 2017 at 18:22 Comment(3)
Intel's HT on Sandybridge-family isn't just a cheap HW context switch on a stall. The front-end (fetch / decode / issue+rename into the out-of-order core) alternates between the two threads every clock cycle. (Unless one thread is blocked on a cache miss or branch mispredict, then the other thread gets the full front-end bandwidth.) uops from both threads can be in flight in the OoO core, executing in the same cycle; many resources are competitively shared between the two threads (e.g. cache), but others are statically partitioned (e.g. the ROB, so one thread can't starve the other when it stalls). – Picky
AMD's CMT in Bulldozer and SMT in Ryzen work the same way, too. See agner.org/optimize, and other links in the x86 tag wiki for more details. It's what these slides call "fine-grained multithreading": cis.upenn.edu/~milom/cis501-Fall05/lectures/12_smt.pdf. Those slides reserve the term SMT for when instructions from different threads issue in the same cycle, which the mainstream desktop x86 cores never do. (And I don't think Silvermont / Knights Landing 4-way SMT does that either.) – Picky
The first half of your answer is good, though. Welcome to Stack Overflow. :) Oh hmm, actually those slides are talking about in-order pipelines, and their FGMT can't mix uops from threads after they issue like HT can. And the scheduling in HT is dynamic; when one thread doesn't have any uops ready to issue, the other gets all the bandwidth. So those slides are describing an SMT that's pretty much like HT. – Picky
