I want to submit Runnable tasks to a ForkJoinPool via the method:
forkJoinPool.submit(Runnable task)
Note: I use JDK 7.
Under the hood, they are transformed into ForkJoinTask objects. I know that ForkJoinPool is efficient when a task is split into smaller ones recursively.
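To make the setup concrete, here is a minimal sketch of what I mean by submitting plain Runnable tasks with no recursion. The class name `SubmitRunnables`, the method `runAll`, and the counter are made up for illustration; only `ForkJoinPool.submit(Runnable)` and the returned `ForkJoinTask` come from the JDK. It is written without lambdas so it compiles on JDK 7.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: submits independent, non-recursive Runnables.
class SubmitRunnables {

    // Submits `count` Runnables and returns how many of them actually ran.
    static int runAll(int count) throws Exception {
        // No-arg constructor: parallelism defaults to the number of available processors.
        ForkJoinPool pool = new ForkJoinPool();
        final AtomicInteger completed = new AtomicInteger();
        List<ForkJoinTask<?>> handles = new ArrayList<ForkJoinTask<?>>();
        try {
            for (int i = 0; i < count; i++) {
                // submit(Runnable) wraps the Runnable in a ForkJoinTask and returns it.
                handles.add(pool.submit(new Runnable() {
                    @Override
                    public void run() {
                        completed.incrementAndGet(); // stand-in for a small CPU-bound task
                    }
                }));
            }
            // Wait for every task; get() rethrows any task failure as ExecutionException.
            for (ForkJoinTask<?> handle : handles) {
                handle.get();
            }
        } finally {
            pool.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("tasks completed: " + runAll(1000));
    }
}
```

Nothing here forks or joins; every task is handed to the pool from an outside thread, which is exactly the case I am asking about.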
Question:
Does work-stealing still work in the ForkJoinPool if there is no recursion?
Is it worth it in this case?
Update 1: Tasks are small and can be unbalanced. Even for strictly equal tasks, things like context switching, thread scheduling, parking, page misses, etc. get in the way and lead to imbalance.
Update 2: Doug Lea gave a hint on this in the Concurrency JSR-166 Interest group:
This also greatly improves throughput when all tasks are async and submitted to the pool rather than forked, which becomes a reasonable way to structure actor frameworks, as well as many plain services that you might otherwise use ThreadPoolExecutor for.
I presume that, thanks to this optimization, ForkJoinPool is the way to go for reasonably small CPU-bound tasks. The main point is that these tasks are already small and don't need recursive decomposition. Work-stealing works regardless of whether a task is big or small: a free worker can grab tasks from the tail of a busy worker's deque.
Update 3: Scalability of ForkJoinPool: the Akka team's ping-pong benchmark shows great results.
Despite this, applying ForkJoinPool efficiently still requires performance tuning.
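One rough way to observe this is `ForkJoinPool.getStealCount()`, which the Javadoc describes as an estimate of tasks taken by one thread from another's queue. The sketch below is only a probe under assumptions: the class `StealCountProbe`, the `spin` workload, and the 1-in-10 "heavy task" ratio are invented for illustration, and how externally submitted tasks are counted is an implementation detail that may differ between JDK versions, so the printed number should be read as a hint, not a measurement.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical probe: submits unbalanced, non-recursive tasks and reads the steal estimate.
class StealCountProbe {

    // Artificial CPU-bound work; cost grows with `iterations`.
    static long spin(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += i ^ (acc >>> 3);
        }
        return acc;
    }

    // Returns { number of tasks that ran, pool.getStealCount() }.
    static long[] run(int tasks, int parallelism) throws InterruptedException {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        final AtomicInteger done = new AtomicInteger();
        final AtomicLong sink = new AtomicLong(); // consumes results so the work is not optimized away
        for (int i = 0; i < tasks; i++) {
            // Assumed imbalance: every 10th task is about 100x heavier than the rest.
            final int work = (i % 10 == 0) ? 200000 : 2000;
            pool.execute(new Runnable() {
                @Override
                public void run() {
                    sink.addAndGet(spin(work));
                    done.incrementAndGet();
                }
            });
        }
        pool.shutdown(); // previously submitted tasks still run
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return new long[] { done.get(), pool.getStealCount() };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] result = run(10000, 4);
        System.out.println("completed=" + result[0] + ", stealCount=" + result[1]);
    }
}
```

The exact steal count varies from run to run; the point is only that it can be inspected without any fork/join recursion in the tasks themselves.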
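As a starting point for that tuning, the two knobs I am aware of in JDK 7 are the parallelism level and `asyncMode`, both set through the four-argument `ForkJoinPool` constructor. The Javadoc suggests `asyncMode = true` (FIFO scheduling of forked tasks that are never joined) for event-style tasks. The sketch below is an assumption-laden example, not a recommendation: the class `TunedPool`, the helper names, and the sum-of-squares workload are made up, and the right parallelism value has to be found by measurement.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper showing the explicit ForkJoinPool constructor.
class TunedPool {

    static ForkJoinPool create(int parallelism) {
        return new ForkJoinPool(
                parallelism,                                      // number of worker threads to target
                ForkJoinPool.defaultForkJoinWorkerThreadFactory,  // default worker thread factory
                null,                                             // null = default uncaught-exception handling
                true);                                            // asyncMode: FIFO for tasks that are never joined
    }

    // Toy workload: each Runnable adds i*i to a shared sum.
    static long sumSquares(int n, int parallelism) throws InterruptedException {
        ForkJoinPool pool = create(parallelism);
        final AtomicLong sum = new AtomicLong();
        for (int i = 1; i <= n; i++) {
            final long v = i;
            pool.execute(new Runnable() {
                @Override
                public void run() {
                    sum.addAndGet(v * v);
                }
            });
        }
        pool.shutdown(); // previously submitted tasks still run
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("pool did not terminate in time");
        }
        return sum.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumSquares(10, 4));
    }
}
```

Comparing throughput with `asyncMode` on and off, and at a few parallelism levels, would be the obvious first experiment.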