I don't have much experience with parallel programming, but I ran into an interesting situation when trying to run my Julia code in parallel.
@threads for i in 1:THREADS
    run(parameters[i], tm)
end
parameters::Vector{Parameters} is a vector of mutable structs, and tm is the termination time for each thread. No atomic variables are used. The average number of iterations for various values of THREADS is as follows:
THREADS Iterations
1 35087
2 44079
3 50220
4 43701
5 39624
6 38986
7 34625
8 35810
9 29248
10 28075
11 20376
12 27342
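For reference, here is a minimal sketch of how per-thread iteration counts like these could be collected. It is not the actual code from the question: count_iterations is a hypothetical stand-in for run(parameters[i], tm), and the placeholder work inside it is an assumption. Note that the parallelism of the @threads loop is bounded by nthreads(), so Julia has to be started with enough threads (for example `julia -t 12`).

```julia
using Base.Threads
using Statistics

# Hypothetical stand-in for run(parameters[i], tm): loops until `tm` seconds
# have elapsed and returns the number of iterations it completed.
function count_iterations(tm)
    deadline = time() + tm
    iterations = 0
    while time() < deadline
        v = rand(1_000)      # placeholder allocation-heavy work (assumption)
        sum(v)
        iterations += 1
    end
    return iterations
end

# Runs `ntasks` time-limited loops in parallel and averages their counts.
function average_iterations(ntasks, tm)
    counts = zeros(Int, ntasks)
    @threads for i in 1:ntasks
        counts[i] = count_iterations(tm)
    end
    return mean(counts)
end

println("threads available: ", nthreads())
for ntasks in 1:nthreads()
    println(ntasks, "  ", average_iterations(ntasks, 0.2))
end
```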
The iteration count peaks at THREADS=3. My CPU is a 12-core Apple M2 Pro (3480 MHz). I should note that the run(.,.) function implements a genetic algorithm and allocates and frees a very large number of objects in memory, so I suspect the slowdown is caused by Julia's garbage collector. If you have any idea why it peaks at THREADS=3, I'd love to know. Thank you.
run? There is no reason to assume that a function scales well in such a case. The GC is indeed often the culprit. It can also be the memory (which does not scale), the CPU (power limits, frequency scaling, etc.), or OS scheduling. Julia generally prints an estimate of the time taken by the GC (though it is not always accurate). Reducing allocations is the key to reducing GC overhead. – Mariomariology
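As a rough illustration of the two points above (checking the GC share, then reducing allocations), here is a minimal sketch. The functions allocating_step and inplace_step! are hypothetical stand-ins for the genetic algorithm, not the code from the question. The @timed macro returns the elapsed time, the bytes allocated, and the GC time, which should be read as an estimate only.

```julia
using Base.Threads
using Random

# Hypothetical allocation-heavy step: builds a fresh vector on every call.
allocating_step(n) = sum(rand(n))

# Same computation, but refills a caller-provided buffer instead of allocating.
inplace_step!(buf) = sum(rand!(buf))

function run_allocating(ntasks, iters, n)
    totals = zeros(ntasks)
    @threads for i in 1:ntasks
        s = 0.0
        for _ in 1:iters
            s += allocating_step(n)
        end
        totals[i] = s
    end
    return totals
end

function run_inplace(ntasks, iters, n)
    totals = zeros(ntasks)
    @threads for i in 1:ntasks
        buf = Vector{Float64}(undef, n)   # one buffer per task, reused every iteration
        s = 0.0
        for _ in 1:iters
            s += inplace_step!(buf)
        end
        totals[i] = s
    end
    return totals
end

# Warm up so that compilation is not included in the measurements.
run_allocating(1, 1, 10)
run_inplace(1, 1, 10)

ntasks, iters, n = nthreads(), 2_000, 1_000
alloc_stats = @timed run_allocating(ntasks, iters, n)
inplace_stats = @timed run_inplace(ntasks, iters, n)

for (name, st) in (("allocating", alloc_stats), ("in-place", inplace_stats))
    println(name, ": ", round(st.time; digits=3), " s, ",
            st.bytes, " bytes allocated, GC share ",
            round(100 * st.gctime / st.time; digits=1), " %")
end
```

If the GC share grows noticeably as the thread count goes up, that would be consistent with the garbage collector being the bottleneck, though memory bandwidth and CPU frequency scaling can produce similar curves.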