What is the difference between a Cluster and MPP supercomputer architecture?

Asked 6/4, 2011 at 18:3 Answered 3/7, 2017 at 19:12

Tooley answered 6/4, 2011 at 18:3 Comment(0)

In a cluster, each machine is largely independent of the others in terms of memory, disk, etc. They are interconnected using some variation on normal networking. The cluster exists mostly in the mind of the programmer and how s/he chooses to distribute the work.

In a Massively Parallel Processor, there really is only one machine with thousands of CPUs tightly interconnected. MPPs have exotic memory architectures to allow extremely high speed exchange of intermediate results with neighboring processors.

The major variants are SIMD (Single Instruction, Multiple Data) and MIMD (Multiple Instruction, Multiple Data). In a SIMD system, every processor executes the same instruction at the same time, only on different bits of memory. Essentially, there is only one Program Counter (PC). In a MIMD machine, each CPU has its own PC.
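As a rough illustration of the program-counter difference, here is a toy model in plain Python. It is only a sketch: `run_simd`, `run_mimd`, and the tiny "instructions" are made-up names for this example, and real hardware is far more involved.

```python
# Toy model only: instructions are plain Python functions, not real machine ops.

def run_simd(program, lanes):
    """One shared program counter: every lane runs the same instruction in lockstep."""
    pc = 0
    while pc < len(program):
        op = program[pc]
        lanes = [op(x) for x in lanes]  # same instruction, different data
        pc += 1
    return lanes


def run_mimd(programs, data):
    """One program counter per processor: each one steps through its own program."""
    pcs = [0] * len(programs)
    data = list(data)
    while any(pc < len(prog) for pc, prog in zip(pcs, programs)):
        for i, prog in enumerate(programs):
            if pcs[i] < len(prog):
                data[i] = prog[pcs[i]](data[i])
                pcs[i] += 1
    return data


def double(x):
    return x * 2


def inc(x):
    return x + 1


simd_result = run_simd([double, inc], [1, 2, 3, 4])
mimd_result = run_mimd([[double], [inc, inc], [double, inc]], [1, 2, 3])
print(simd_result)  # [3, 5, 7, 9]
print(mimd_result)  # [2, 4, 7]
```

In the SIMD case all four lanes go through `double` and then `inc` together; in the MIMD case each processor finishes its own, differently sized program at its own pace.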

MPPs can be a bitch to program and are of use only on algorithms that are embarrassingly parallel (that's actually what they call it). However, if you have such a problem, then an MPP can be shockingly fast. They are also incredibly expensive.
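For what "embarrassingly parallel" looks like in code, here is a minimal sketch. It assumes a thread pool as a stand-in for the processors, and the function names are invented for the example. The point is only that each task works on its own slice and never talks to the others until the final reduction.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_of_squares(chunk):
    # Each task touches only its own chunk: no communication with other workers.
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4):
    # Strided split so each worker gets a disjoint slice of the input.
    chunks = [values[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    # The only coordination step is this final reduction.
    return sum(partials)


total = parallel_sum_of_squares(list(range(1000)))
print(total == sum(x * x for x in range(1000)))  # True
```

Problems where the workers must constantly exchange intermediate results do not split this cleanly, which is where the interconnect starts to matter.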

Belch answered 6/4, 2011 at 18:18 Comment(6)

I agree much more with the ang mo's answer below. Today's MPPs in TOP500 have (typically, if not all) hybrid distributed-shared memory architectures. For programmers there is no difference; MPI (+ OpenMP, CUDA, ...) is mostly used in practice. Sometimes, topology-aware codes can perform better, but at the cost of portability. – Ritenuto 21/8, 2014 at 13:23

You are probably correct. Unfortunately the paper referenced by ang mo is behind a paywall, so I can't comment on it directly. My answer was, admittedly, based on my own experiences dating back quite a few years (e.g. 1990 MasPar machines). The present-day technology that is most closely related to my answer is probably the parallel stream processing in a modern GPGPU. I believe that the one enduring constant is that using any sort of MPP requires the programmer to think very differently about how their problem should be attacked. – Belch 21/8, 2014 at 18:5

Sure, today's terminology is a bit different. Machines like BG/Q or Cray XC are considered MPPs. Wikipedia states that MPPs have many of the same characteristics as clusters, but MPPs have specialized interconnect networks (whereas clusters use commodity hardware for networking), which is the terminology I would adopt. – Ritenuto 22/8, 2014 at 7:35

You are the LMGTFY answer now :) – Philippe 6/11, 2015 at 18:22

@RonE: weirdly enough I also seem to have that status for a page I wrote about 7 or 8 years ago on fixing a Kenmore Electric Dryer. I've received over 100 "Thank You!" emails from people who found out they could save big bucks by doing a little diagnostic work on their own. – Belch 6/11, 2015 at 21:48

Dongarra et al. reference here: netlib.org/utk/people/JackDongarra/PAPERS/… . @ang mo Thanks for the paper reference. – Reliquiae 30/12, 2015 at 13:48

The TOP500 list uses a slightly different distinction between an MPP and a cluster, as explained in the Dongarra et al. paper:

[a cluster is a] parallel computer system comprising an integrated collection of independent nodes, each of which is a system in its own right, capable of independent operation and derived from products developed and marketed for other stand-alone purposes

Compared to a cluster, a modern MPP (such as the IBM Blue Gene) is more tightly-integrated: individual nodes cannot run on their own and they are connected by a custom network (like a multidimensional torus). But, similarly to a cluster, there is no single, shared memory spanning all the nodes (note: an MPP might be hierarchical and shared memory might be used inside a single node (NUMA), or between a handful of nodes).
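To give a feel for what a torus interconnect means, here is a small sketch that lists the direct neighbors of a node when every dimension wraps around. The function name and the 4x4x4 size are assumptions chosen for illustration, not the layout of any particular machine.

```python
def torus_neighbors(coord, dims):
    """Return the neighbors of `coord` in a torus: one step each way per dimension."""
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % size  # wraparound link at the edges
            neighbors.append(tuple(n))
    return neighbors


# Hypothetical 4 x 4 x 4 torus: even a "corner" node has six direct neighbors.
corner = torus_neighbors((0, 0, 0), (4, 4, 4))
print(corner)
```

Because of the wraparound links, no node sits at an edge of the network, which keeps hop counts between any two nodes bounded by half the size of each dimension.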

I would thus be extremely careful about using the terms SIMD and MIMD in this context, as they usually describe shared memory architectures (SMP).

Update:

Dongarra et al. link

Update: MPP can have nodes that use shared memory internally; but the whole MPP memory is not shared.

Rositaroskes answered 18/2, 2014 at 11:16 Comment(1)

I would dispute the claim that MPPs don't have shared memory. Nodes in an MPP, for example in the SGI Altix or Cray T3E, which use CC-NUMA and NCC-NUMA technology, implement a distributed shared memory (DSM). – Kennie 8/8, 2018 at 16:30

A cluster is a bunch of machines, usually with an Ethernet interconnect (read: network), each running its own separate copy of an OS, that happen to serve a single purpose.

An MPP supercomputer usually implies a very fast proprietary interconnect (e.g. SGI NUMAlink) that supports either Distributed Shared Memory (processes running on different MPP nodes use shared memory over the fast interconnect to share data as if they were running on a single computer) or even a Single System Image (a single instance of an operating system, mostly Linux, running on all the nodes at the same time as if on a single machine; e.g. "ps aux" on any node will show you all the processes running on the MPP).

As you can see, the definition is quite fluid; it's more a question of scale than of clear-cut differences.
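To contrast the two programming styles mentioned above, here is a minimal sketch using Python threads as stand-ins for nodes. This is an analogy only: the function names are made up, a `queue.Queue` plays the role of the network, and an ordinary list plays the role of memory that every node can address.

```python
import queue
import threading


def message_passing_sum(chunks):
    """Cluster style: workers own their data and send results as explicit messages."""
    inbox = queue.Queue()

    def worker(chunk):
        inbox.put(sum(chunk))  # explicit "send" to the coordinator

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in chunks)  # explicit "receive"


def shared_memory_sum(chunks):
    """DSM style: workers store results into one address space everyone can read."""
    results = [0] * len(chunks)  # stands in for memory visible to all nodes

    def worker(i, chunk):
        results[i] = sum(chunk)  # plain store, no message

    threads = [threading.Thread(target=worker, args=(i, c)) for i, c in enumerate(chunks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


chunks = [[1, 2, 3], [4, 5], [6]]
mp_total = message_passing_sum(chunks)
sm_total = shared_memory_sum(chunks)
print(mp_total, sm_total)  # 21 21
```

Both versions compute the same answer; the difference is whether the data movement is written out by the programmer or handled underneath by the memory system and interconnect.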

Satellite answered 6/4, 2011 at 18:14 Comment(0)

I've searched a lot of HPC literature and couldn't find a concrete definition of MPP. There is fairly broad consensus that a cluster consists of multiple interconnected regular personal computers or workstations, usually coupled with standard technologies (like Ethernet or open-source operating systems). The term MPP is usually applied to more proprietary approaches to building distributed-memory computers, usually involving proprietary technologies.

For example: Tianhe-2 is considered a cluster because it uses x86-64 nodes and a regular operating system (Kylin Linux). Sunway TaihuLight is considered an MPP because its nodes use their own particular architecture, SW26010, and run their own operating system, called Sunway Raise OS.

The most concrete explanation of this matter I found was in Sourcebook of Parallel Computing (Dongarra et al.):

We note that the term cluster can be applied both broadly (any system built with a significant number of commodity components) or narrowly (only commodity components and open-source software). In fact, there is no precise definition of a cluster. Some of the issues that are used to argue that a system is a massively parallel processor (MPP) instead of a cluster include proprietary interconnects (...), particularly ones designed for a specific parallel computer, and special software that treats the entire system as a single machine, particularly for the system administrators. Clusters may be built from personal computers or workstations (either single processors or symmetric multiprocessors (SMPs)) and may run either open-source or proprietary operating systems.

Pellitory answered 3/7, 2017 at 19:12 Comment(0)
