What is the difference between the SIMD and multi-threading concepts that one comes across in parallel programming?
SIMD means "Single Instruction, Multiple Data" and is an umbrella term describing a method whereby many elements are loaded into extra-wide CPU registers at the same time and a single low-level instruction (such as ADD, MULTIPLY, AND, XOR) is applied to all the elements in parallel. Specific examples are MMX, SSE2/3 and AVX on Intel processors, or NEON on ARM processors, or AltiVec on PowerPC. It is very low-level, and each SIMD operation typically lasts only a few clock cycles. An example might be that, rather than go into a for loop increasing the brightness of the pixels in an image one-by-one, you load 64 8-bit pixels into a single 512-bit wide register (as with AVX-512) and multiply them all up at the same time in one or two clock cycles. SIMD is often implemented for you in high-performance libraries (like OpenCV) or is generated for you by your compiler when you compile with vectorisation enabled, typically at the higher optimisation levels (for example, the -O3 switch). Very experienced programmers may choose to write their own, using "intrinsics".
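As a rough illustration of the pixel-brightening example above, here is a minimal sketch in plain C++ with no intrinsics. The function name `brighten` and the integer `factor` parameter are my own choices for illustration, not from any library. The loop is written so that each iteration is independent of the others, which is the shape a compiler may be able to turn into SIMD instructions for you.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Multiply every 8-bit pixel by `factor`, clamping the result at 255.
// `brighten` and `factor` are hypothetical names used only for this sketch.
void brighten(std::vector<std::uint8_t>& pixels, int factor) {
    assert(factor >= 0);  // a negative factor would wrap when cast back to 8 bits
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        // Each iteration touches only pixels[i], so iterations are independent:
        // this is what allows many of them to be done by one SIMD instruction.
        const int v = pixels[i] * factor;
        pixels[i] = static_cast<std::uint8_t>(std::min(v, 255));
    }
}
```

Whether the loop actually gets vectorised depends on your compiler, target CPU and flags, so treat this as a candidate for SIMD rather than a guarantee.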
Multi-threading means having multiple threads of execution running at the same time, normally on different CPU cores. It is higher-level than SIMD and threads typically exist a lot longer. One thread might be acquiring images, another thread might be detecting objects, another might be tracking the objects and a final one might be displaying the results. A feature of multi-threading is that the threads all share the same address space, so data in one thread can be seen and manipulated by others. This makes sharing data cheap, but it can make for harder debugging. Threads are called "light-weight" because they typically take much less time to create and start than full-blown processes.
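To make the shared address space concrete, here is a small sketch using `std::thread` from the C++ standard library. The function `parallel_sum` is a made-up name for illustration. Every thread reads the same input vector and writes into its own slot of the same results vector, with no copying between threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sum `data` using `n_threads` threads (hypothetical helper for this sketch).
long parallel_sum(const std::vector<int>& data, std::size_t n_threads) {
    assert(n_threads > 0);
    std::vector<long> partial(n_threads, 0);  // one slot per thread, visible to all
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + n_threads - 1) / n_threads;

    for (std::size_t t = 0; t < n_threads; ++t) {
        workers.emplace_back([&data, &partial, t, chunk] {
            // Each thread works on its own slice of the shared input...
            const std::size_t begin = std::min(t * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            // ...and writes only to its own slot, so no locking is needed here.
            for (std::size_t i = begin; i < end; ++i) partial[t] += data[i];
        });
    }
    for (auto& w : workers) w.join();  // wait for every thread to finish
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

If two threads wrote to the same slot you would need a mutex or an atomic, which is where the harder debugging tends to come from. On some toolchains you may also need to compile with -pthread.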
Multi-processing is similar to multi-threading except each process has its own address space, so if you want to share data between the processes, you need to work harder to do it. It has the benefit over multi-threading that one process is unlikely to crash another or interfere with its data, making it somewhat easier to debug.
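The separate address spaces can be seen in a few lines, assuming a POSIX system (Linux, macOS) where `fork()` is available. This sketch does not apply to Windows, and the function name `counter_after_child_write` is invented for illustration. The child process changes a variable, but the parent never sees the change because each process has its own copy.

```cpp
#include <cassert>
#include <sys/wait.h>
#include <unistd.h>

// Returns the parent's view of `counter` after a child process has modified
// its own copy, or -1 if fork() fails. POSIX-only sketch.
int counter_after_child_write() {
    int counter = 1;
    const pid_t pid = fork();
    if (pid < 0) return -1;      // fork failed
    if (pid == 0) {              // child process
        counter = 99;            // changes only the child's copy of the variable
        _exit(0);                // leave the child immediately
    }
    int status = 0;
    waitpid(pid, &status, 0);    // parent waits for the child to finish
    assert(WIFEXITED(status));   // the child should have exited normally
    return counter;              // unchanged in the parent
}
```

To actually pass the value back you would need something like a pipe, a socket or shared memory, which is the "work harder" part.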
If I make an analogy with cooking a meal, then SIMD is like lining up all your green beans and slicing them in one go. The single instruction is "slice", the multiple, repeated data are the beans. In fact, lining things up ("memory alignment") is an important aspect of SIMD.
Then multi-threading is like having multiple chefs all taking ingredients from a shared vegetable larder, preparing them and putting them in a big shared cook-pot. You get the job done faster because there are multiple chefs - analogous to CPU cores - working at once.
In this little analogy, multi-processing is more like each chef having their own vegetable larder and cook-pot, so if one chef runs out of vegetables, or cooking gas, the others are not affected - things are more independent. You get the job done faster, because there are more chefs, but you have to do a bit more organisation (or "synchronisation") to get all the chefs to serve their meals at the same time at the end.
There is nothing to prevent an application using SIMD as well as multi-threading and multi-processing at the same time. Going back to the cooking analogy, you can have multiple chefs (multi-threading or multi-processing) who are all slicing their green beans efficiently (SIMD). It is my impression that most applications either use SIMD and multi-threading, or SIMD and multi-processing, but relatively few use both multi-threading and multi-processing. YMMV on this bit!
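As a sketch of combining the two, here is one way it might look in C++: several threads (the chefs) each take a contiguous slice of an image, and within each slice a simple loop does the brightening in a form the compiler may vectorise (the bean slicing). The names `brighten_range` and `brighten_parallel` are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Inner level: independent per-pixel work over a contiguous range,
// which is a candidate for SIMD via compiler auto-vectorisation.
void brighten_range(std::uint8_t* p, std::size_t n, int factor) {
    for (std::size_t i = 0; i < n; ++i)
        p[i] = static_cast<std::uint8_t>(std::min(p[i] * factor, 255));
}

// Outer level: split the image into one contiguous slice per thread.
void brighten_parallel(std::vector<std::uint8_t>& pixels, int factor,
                       std::size_t n_threads) {
    assert(factor >= 0 && n_threads > 0);
    const std::size_t chunk = (pixels.size() + n_threads - 1) / n_threads;
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n_threads; ++t) {
        const std::size_t begin = std::min(t * chunk, pixels.size());
        const std::size_t len = std::min(chunk, pixels.size() - begin);
        // Slices do not overlap, so the threads never write to the same pixel.
        workers.emplace_back(brighten_range, pixels.data() + begin, len, factor);
    }
    for (auto& w : workers) w.join();
}
```

For a small image the cost of starting threads will likely outweigh any gain, so this only pays off on large inputs.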