Work stealing is a technique used by modern thread pools to decrease contention on the work queue.
A classical thread pool has one queue; each thread-pool thread locks the queue, dequeues a task, and then unlocks the queue. If the tasks are short and there are many of them, there is a lot of contention on the queue. Using a lock-free queue helps a lot here, but doesn't solve the problem entirely.
Modern thread pools use work stealing: each thread has its own queue. When a thread-pool thread produces a task, it enqueues it to its own queue. When a thread-pool thread wants to dequeue a task, it first tries its own queue, and if that queue is empty, it "steals" work from the other threads' queues. This greatly decreases contention in the thread pool and improves performance.
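The take-from-your-own-queue-first, steal-otherwise rule described above can be sketched in a few lines. This is only a toy model, not how any real pool is implemented: the class `WorkStealingSketch` and its methods are hypothetical names, there are no actual threads, and the choice to take from the tail of one's own deque while stealing from the head of another is an assumption (it is a common convention, but the description above does not require it).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical, simplified model: one deque per worker, workers identified by index.
class WorkStealingSketch {
    private final List<ConcurrentLinkedDeque<Runnable>> queues = new ArrayList<>();

    WorkStealingSketch(int workers) {
        for (int i = 0; i < workers; i++) {
            queues.add(new ConcurrentLinkedDeque<>());
        }
    }

    // A worker enqueues the tasks it produces onto its own deque.
    void submit(int worker, Runnable task) {
        queues.get(worker).addLast(task);
    }

    // Try the worker's own deque first; if it is empty, steal from another worker.
    Runnable nextTask(int worker) {
        Runnable own = queues.get(worker).pollLast();
        if (own != null) {
            return own;
        }
        for (int i = 0; i < queues.size(); i++) {
            if (i == worker) {
                continue;
            }
            // Assumption: steal from the opposite end to stay out of the owner's way.
            Runnable stolen = queues.get(i).pollFirst();
            if (stolen != null) {
                return stolen;
            }
        }
        return null; // no work anywhere
    }

    public static void main(String[] args) {
        WorkStealingSketch pool = new WorkStealingSketch(2);
        pool.submit(0, () -> System.out.println("task A (queued by worker 0)"));
        pool.submit(0, () -> System.out.println("task B (queued by worker 0)"));
        pool.nextTask(0).run(); // worker 0 takes from its own deque
        pool.nextTask(1).run(); // worker 1 has nothing queued, so it steals from worker 0
    }
}
```

The point of the design is that a worker normally touches only its own deque, so the shared, contended queue of the classical pool disappears; the other deques are visited only when a worker runs dry.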
newWorkStealingPool creates a work-stealing thread pool whose number of threads equals the number of available processors.
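As a rough illustration of the "many short, CPU-bound tasks" case, the sketch below submits one tiny task per number to a pool obtained from `Executors.newWorkStealingPool()`. The class name `WorkStealingPoolExample` and the helper `sumOfSquares` are made up for this example; only the `Executors`, `CompletableFuture`, and `Runtime` calls are real JDK APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class WorkStealingPoolExample {
    // Hypothetical helper: sums the squares of 1..n using one short task per number.
    static long sumOfSquares(int n) {
        // Parallelism defaults to the number of available processors.
        ExecutorService pool = Executors.newWorkStealingPool();
        try {
            List<CompletableFuture<Long>> results = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long value = i;
                results.add(CompletableFuture.supplyAsync(() -> value * value, pool));
            }
            long total = 0;
            for (CompletableFuture<Long> result : results) {
                total += result.join(); // the calling thread waits, not a pool thread
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("available processors: " + Runtime.getRuntime().availableProcessors());
        System.out.println("sum of squares 1..100: " + sumOfSquares(100));
    }
}
```

Note that in OpenJDK the `ExecutorService` returned here is itself backed by a `ForkJoinPool`, although the declared return type does not promise that.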
newWorkStealingPool presents a new problem. If I have four logical cores, the pool will have four threads in total. If my tasks block, for example on synchronous IO, I don't utilize my CPUs enough. What I want is four active threads at any given moment: for example, four threads that encrypt with AES and another 140 threads that wait for IO to finish.
This is what ForkJoinPool provides: if a task spawns new tasks and then waits for them to finish (for example with join()), the pool can start additional threads to compensate, so the CPU stays saturated. It is worth mentioning that ForkJoinPool uses work stealing too.
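For reference, here is a minimal sketch of the "task spawns new tasks and waits for them" pattern using `RecursiveTask`. The class name `ForkJoinSum`, the `sum` helper, and the `THRESHOLD` value are assumptions made for this example; a real cut-off should be tuned to the workload.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums a slice of an array by splitting it in half.
class ForkJoinSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // arbitrary cut-off chosen for this sketch
    private final long[] data;
    private final int from;
    private final int to;

    ForkJoinSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        ForkJoinSum left = new ForkJoinSum(data, from, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, to);
        left.fork();                     // spawn the left half as a new task
        long rightSum = right.compute(); // keep working on the right half in this thread
        return left.join() + rightSum;   // wait for the spawned task to finish
    }

    static long sum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool(); // parallelism defaults to the processor count
        try {
            return pool.invoke(new ForkJoinSum(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println("sum of 1..100000: " + sum(data));
    }
}
```

While a worker is stuck in `join()`, the pool lets it run other pending tasks, or adds a spare thread when that is not possible, which is the mechanism the paragraph above refers to.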
Which one to use? If you work with the fork-join model or you know your tasks block indefinitely, use ForkJoinPool. If your tasks are short and mostly CPU-bound, use newWorkStealingPool.
And after all has been said, modern applications tend to use a thread pool with as many threads as there are available processors, and rely on asynchronous IO and lock-free containers to avoid blocking. This (usually) gives the best performance.
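A rough sketch of that last setup, under some loud assumptions: `AsyncPipelineSketch`, `readAsync`, and `process` are hypothetical names, and the "asynchronous IO" is only simulated with `CompletableFuture.delayedExecutor` (Java 9+), which completes the future later without parking a thread of the CPU pool in the meantime. Real code would use an actual non-blocking IO API instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AsyncPipelineSketch {
    // Hypothetical stand-in for asynchronous IO: the result arrives about 50 ms later,
    // and no thread of the CPU pool is occupied while waiting for it.
    static CompletableFuture<String> readAsync(String payload) {
        return CompletableFuture.supplyAsync(
                () -> payload,
                CompletableFuture.delayedExecutor(50, TimeUnit.MILLISECONDS));
    }

    static int process(String payload) {
        // One thread per available processor, reserved for CPU-bound work.
        ExecutorService cpuPool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            return readAsync(payload)
                    .thenApplyAsync(String::length, cpuPool) // CPU-bound step runs on the pool
                    .join(); // only the calling thread waits here
        } finally {
            cpuPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("processed length: " + process("hello"));
    }
}
```

The pool threads only ever run the CPU-bound continuation, so a pool sized to the processor count is enough even when many reads are in flight.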