I try to use a ForkJoinPool to parallelize my CPU intensive calculations. My understanding of a ForkJoinPool is, that it continues to work as long as any task is available to be executed. Unfortunately I frequently observed worker threads idling/waiting, thus not all CPU are kept busy. Sometimes I even observed additional worker threads.
I did not expect this, as I strictly tried to use non blocking tasks. My observation is very similar to those of ForkJoinPool seems to waste a thread. After debugging a lot into ForkJoinPool I have a guess:
I used invokeAll() to distribute work over a list of subtasks. After invokeAll() finished to execute the first task itself it starts joining the other ones. This works fine, until the next task to join is on top of the executing queue. Unfortunately I submitted additional tasks asynchronously without joining them. I expected the ForkJoin framework to continue executing those task first and than turn back to joining any remaining tasks.
But it seems not to work this way. Instead the worker thread gets stalled calling wait() until the task waiting for gets ready (presumably executed by an other worker thread). I did not verify this, but it seems to be a general flaw of calling join().
ForkJoinPool provides an asyncMode, but this is a global parameter and can not be used for individual submissions. But I like to see my asynchronously forked tasks to be executed soon.
So, why does ForkJoinTask.doJoin() not simply executes any available task on top of its queue until it gets ready (either executed by itself or stolen by others)?