Multithreading vs Multiprocessing in Julia

I am new to Julia and am confused about multiprocessing and multithreading. Similar questions have been asked on Stack Overflow before, but I am still unsure about the following:

  1. Does multithreading with @async let us use more than one CPU core at a time? That is, do we get real parallel processing when running a multithreaded Julia program on a machine with more than one core?
  2. If the answer to the first question is yes, i.e. multithreading can take advantage of multiple cores or CPUs, then why is multiprocessing (with using Distributed) needed?
  3. I have previously used multithreading in C++ with the STL threads library. Can anyone elaborate on its core usage? Can it take advantage of multiple cores?
Frieder answered 16/8, 2022 at 13:52 Comment(1)
Have a look at #64094828, #59825560, and #64489782. - Vulgus

This high-level overview may help you:

  • multithreading (Threads module):
    • advantages: threads are computationally "cheap" to create, because memory is shared;
    • disadvantages: limited to the cores of a single machine, since all threads share one address space; care is needed not to overwrite the same memory, or to do so only in the intended order (avoiding a "data race"); threads can't be added dynamically from within a script, so Julia has to be started with the required, fixed number of threads, typically the number of cores of your machine;
  • multiprocessing (Distributed module):
    • advantages: the number of processes is not fixed; they can run on different CPUs of the same machine or on different nodes of a cluster, even over SSH on different networks; processes can be added from within the code with addprocs(nToAdd);
    • disadvantages: memory is copied (each process has its own memory), which is computationally expensive (the gain must exceed the cost of setting up a new process), and care is needed in selecting which data a given process needs to "bring with it" to do its work.
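To make the two bullets above concrete, here is a minimal sketch that runs the same computation serially, with threads, and with worker processes. The function names `work` and `threaded_map` and the worker count of 2 are illustrative assumptions, not part of the original answer; the script assumes it is run at top level, and the threaded part only runs in parallel if Julia was started with more than one thread (for example `julia --threads 4`).

```julia
using Base.Threads: @threads, nthreads   # multithreading (standard library)
using Distributed                        # multiprocessing (standard library)

# Multiprocessing: worker processes can be added at run time.
worker_ids = addprocs(2)

# `work` is a hypothetical CPU-bound function. @everywhere defines it on the
# main process and on every worker, because workers do not share memory.
@everywhere work(n) = sum(sqrt(i) for i in 1:n)

# Multithreading: the number of threads was fixed when Julia started.
function threaded_map(f, xs)
    out = Vector{Float64}(undef, length(xs))
    @threads for i in eachindex(xs)
        out[i] = f(xs[i])   # each iteration writes only its own slot: no data race
    end
    return out
end

xs = 1:8
serial      = map(work, xs)
threaded    = threaded_map(work, xs)
distributed = pmap(work, xs)   # arguments and results are copied between processes

println("threads: ", nthreads(), ", workers: ", nworkers())
println(threaded == serial && distributed == serial)

rmprocs(worker_ids)   # shut the extra processes down again
```

For a workload this small, the cost of starting workers and copying data will most likely outweigh any gain, which is the trade-off the "disadvantages" bullet describes.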

Apart from that, a third level of parallelisation is possible in Julia, at the hardware level within a single core: it exploits the SIMD (single instruction, multiple data) instructions of modern CPUs through the macro @simd or (from the LoopVectorization.jl package) @turbo. There is also the massive parallelism provided by some supported GPUs (again using external packages, see JuliaGPU).
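As a rough illustration of the SIMD level, the sketch below annotates a reduction loop with @simd from Base. The function name `simd_sum` is a hypothetical choice for this example. Whether the compiler actually emits vector instructions depends on the CPU and the Julia version, and because @simd permits reordering of floating-point additions, the result can differ slightly from a strict left-to-right sum on general input.

```julia
# Sum a vector with a loop the compiler is allowed to vectorize.
function simd_sum(xs::Vector{Float64})
    s = 0.0
    # @simd allows the additions to be reordered; @inbounds skips bounds checks,
    # which is safe here because eachindex(xs) only yields valid indices.
    @inbounds @simd for i in eachindex(xs)
        s += xs[i]
    end
    return s
end

total = simd_sum(collect(1.0:100.0))
println(total)
```

The generated code can be inspected with `@code_llvm simd_sum(rand(100))` to check whether vector instructions were produced on a given machine.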

Risibility answered 16/8, 2022 at 14:46 Comment(4)
Nice answer, but a remark about @avx: first, it has been renamed to @turbo, and second, it is not part of Base Julia but lives in the external package LoopVectorization.jl. Furthermore, parallelism can also be exploited with GPUs or other specialized hardware. - Liverpool
Good answer, but it misses a point the OP mentioned: the use of @async in multithreaded code. AFAIK Julia has a task scheduler, but it is not clear whether there is one central scheduler that serializes tasks or one scheduler per thread (I think the second option is the correct one). If the scheduler were centralized, only one core could be used for async tasks. - Spawn
Multithreading: in practice the performance drops beyond 16 or 32 cores, so for larger servers multiprocessing is faster. Multiple processes can share memory when run on one host by using SharedArrays. Multiprocessing in Julia also runs on distributed clusters. There are also green threads (@async), which are good for parallelizing I/O. - Vulgus
If we start Julia with only one thread, will @async work? If yes, how? - Frieder
