Why does my PLINQ ForAll call end up using a single thread?

I have been using PLINQ recently to perform some data handling.

I have about 4000 time series (essentially instances of Dictionary<DateTime,T>), which I store in a list called timeSeries.

To perform my operation, I simply do:

timeSeries.AsParallel().ForAll(x => myOperation(x));

When I watch what my cores are doing, I see that at first all my CPUs are in use, and the console (where I output some logs) shows several time series being processed at the same time.

However, the process is lengthy, and after about 45 minutes, the logging clearly indicates that there is only one thread working. Why is that?

I gave it some thought and realized that the instances at the beginning and the end of timeSeries are simpler to process from myOperation's point of view. So I wondered whether the algorithm PLINQ uses consists of splitting the 4000 instances across, say, 4 cores, giving each of them 1000. Then, when a core has finished its allocation of work, it goes back to idle. This would mean that one of the cores may be facing a much heavier workload than the others.

Is my theory correct or is there another possible explanation?

Should I shuffle my list before running it, or is there some kind of parallelism parameter I can use to fix this problem?

Disconcert answered 25/7, 2013 at 7:47 Comment(0)

Your theory is probably correct, although there is something called "work stealing" that should counter this. I'm not sure why that doesn't work here. Are there many (dozens or more) large jobs at the outer ends, or just a few?

Aside from shuffling your data, you could use the overload of AsParallel() that accepts a custom Partitioner. That would allow you to balance the work better.
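
As a rough sketch of the partitioner idea, the snippet below uses the framework's built-in load-balancing partitioner (Partitioner.Create with loadBalance: true) rather than a hand-written one, so that items are handed out on demand instead of being split into fixed ranges up front. The class name LoadBalancedPlinq, the ProcessAll helper, the SpinWait-based MyOperation and the synthetic work sizes are all made up for illustration; they stand in for the asker's real myOperation and data.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class LoadBalancedPlinq
{
    // Hypothetical stand-in for myOperation: cost grows with workUnits.
    private static void MyOperation(int workUnits)
    {
        Thread.SpinWait(workUnits * 1000);
    }

    // Processes every item through a load-balancing partitioner and
    // returns how many items were handled.
    public static int ProcessAll(IList<int> timeSeries)
    {
        int processed = 0;

        Partitioner.Create(timeSeries, loadBalance: true)
            .AsParallel()
            .ForAll(x =>
            {
                MyOperation(x);
                Interlocked.Increment(ref processed);
            });

        return processed;
    }

    public static void Main()
    {
        // Assumed shape of the workload: cheap items at both ends,
        // expensive items in the middle of the list.
        List<int> timeSeries = Enumerable.Range(0, 4000)
            .Select(i => i >= 1500 && i < 2500 ? 50 : 1)
            .ToList();

        Console.WriteLine(ProcessAll(timeSeries)); // prints 4000
    }
}
```

Whether this actually removes the single-thread tail depends on how uneven the real workload is; it only changes how items are distributed, not how long each one takes.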

Side note: for this situation I would prefer Parallel.ForEach(), which has more options and a cleaner syntax.
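
For comparison, here is a minimal sketch of the Parallel.ForEach variant, again fed by the built-in load-balancing partitioner and with a ParallelOptions instance to show where the extra options go. The names LoadBalancedForEach and ProcessAll, the operation delegate and the sample data are assumptions for the example, not part of the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LoadBalancedForEach
{
    // Runs 'operation' on every item and returns the number of items handled.
    public static int ProcessAll(IList<int> timeSeries, Action<int> operation)
    {
        var options = new ParallelOptions
        {
            // Cap the number of concurrent workers at the core count.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        int processed = 0;

        Parallel.ForEach(
            Partitioner.Create(timeSeries, loadBalance: true),
            options,
            item =>
            {
                operation(item);
                Interlocked.Increment(ref processed);
            });

        return processed;
    }

    public static void Main()
    {
        // Hypothetical workload: the value is the simulated cost of the item.
        List<int> timeSeries = Enumerable.Range(0, 4000)
            .Select(i => i >= 1500 && i < 2500 ? 50 : 1)
            .ToList();

        int count = ProcessAll(timeSeries, units => Thread.SpinWait(units * 1000));
        Console.WriteLine(count); // prints 4000
    }
}
```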

Formyl answered 25/7, 2013 at 8:15 Comment(3)
As far as I know, work stealing works on Tasks, not on iterations in PLINQ. If a Task gets a bunch of items from the collection to process, other Tasks won't be able to steal those. - Shropshire
Also, a custom partitioner might not be needed here; the ones provided by the framework might be enough. - Shropshire
@Shropshire - you're probably right, but would that point to a for-loop creating a bunch of Tasks? Seems unwieldy. - Formyl
