Parallelism with @sync @async in Julia

@sync @async do not add any concurrency to your code: the whole expression runs as a single task that is waited for immediately. The only effect is that of a begin ... end block with its own local scope.

What happens here is that you create a new scope and assign to new local variables inside it, so the global df1 and df2 are never modified; afterwards you are still seeing their old values.
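The scoping effect can be reproduced without any CSV files. The following is a minimal sketch, assuming it is run at top level (a script or the REPL); the names x and box are made up for illustration:

```julia
x = 0                     # global variable, stands in for df1

@sync @async begin
    x = 1                 # assigns a NEW local x inside the task's closure
end
println(x)                # prints 0: the global x was never touched

# Mutating a container does not rebind a name, so it is visible outside.
box = [0]
@sync @async begin
    box[1] = 1            # setindex! on the existing global array
end
println(box[1])           # prints 1
```

This is why writing into a pre-allocated vector works where plain assignment to df1 and df2 does not.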

If I/O is the bottleneck in your code, the correct code would be the following:

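using CSV, DataFrames

dfs = Vector{DataFrame}(undef, 2)
@sync begin
    # Each @async task writes into its own slot of the shared vector.
    # CSV.read(path, DataFrame) replaces the older CSV.File(path) |> DataFrame! idiom.
    @async dfs[1] = CSV.read(libname * "df1.csv", DataFrame)
    @async dfs[2] = CSV.read(libname * "df2.csv", DataFrame)
end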
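To see what @sync with several @async tasks buys you when the work is waiting rather than computing, here is a small sketch that uses sleep as a stand-in for slow I/O (the results vector and the one-second delays are assumptions for illustration, not part of the original code):

```julia
results = Vector{Int}(undef, 2)

elapsed = @elapsed @sync begin
    # Both tasks start right away; while one sleeps, the other can run.
    @async begin
        sleep(1)          # stand-in for waiting on a slow read
        results[1] = 1
    end
    @async begin
        sleep(1)
        results[2] = 2
    end
end

# The two waits overlap, so the total is close to 1 second rather than 2.
println("elapsed: ", round(elapsed; digits = 2), " s")
println(results)
```

@sync returns only after every enclosed @async task has finished, so results is fully populated by the time it is printed.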

However, usually the bottleneck is not the I/O but the CPU. In that case green threads (tasks) are not much use, because they only take turns on the same thread, and you need regular threads:

dfs = Vector{DataFrame}(undef, 2)
# Each iteration may run on a different thread and fills its own slot.
Threads.@threads for i in 1:2
    dfs[i] = CSV.read(libname * "df$i.csv", DataFrame)
end
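The same pattern can be tried without CSV files. In this sketch busy_sum is a hypothetical CPU-heavy function standing in for parsing a file; each loop iteration writes only to its own slot, so no locking is needed:

```julia
using Base.Threads

# Hypothetical CPU-bound work, standing in for parsing one file.
busy_sum(n) = sum(i -> sqrt(i), 1:n)

results = Vector{Float64}(undef, 2)
@threads for i in 1:2
    # Iterations are distributed over the available threads.
    results[i] = busy_sum(10^7 * i)
end

println(results)
```

With only one thread available the loop still gives the same results, it just runs the iterations one after another.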

Note that for this code to use multiple threads you need to set the JULIA_NUM_THREADS environment variable before starting Julia, for example (Windows command prompt; on Linux or macOS use export instead of set):

set JULIA_NUM_THREADS=2
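To check from inside Julia whether the setting took effect, you can query the thread count. A short sketch (on Julia 1.5 and later, starting Julia with julia --threads 2 is an alternative to the environment variable):

```julia
n = Threads.nthreads()    # number of threads this Julia session was started with
println("Julia is running with ", n, " thread(s)")

if n == 1
    println("Threads.@threads loops will run serially in this session")
end
```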
