Work/Task Stealing ThreadPoolExecutor
B

3

8

In my project I am building a Java execution framework that receives work requests from a client. The work (of varying size) is broken down into a set of tasks and then queued up for processing. There is a separate queue for each type of task, and each queue is associated with its own ThreadPool. The ThreadPools are configured so that the overall performance of the engine is optimal.

This design helps us load balance the requests effectively, and large requests don't end up hogging the system resources. However, at times the solution becomes ineffective when some of the queues are empty and their respective thread pools sit idle.

To make this better I was thinking of implementing a work/task stealing technique so that the heavily loaded queue can get help from the other ThreadPools. However this may require implementing my own Executor as Java doesn't allow multiple queues to be associated with a ThreadPool and doesn't support the work stealing concept.

I read about Fork/Join, but it doesn't seem like a fit for my needs. Any suggestions or alternative ways to build this solution would be very helpful.

Thanks Andy

Box answered 14/4, 2012 at 12:32 Comment(6)
You should think about how to keep all your CPUs busy. It doesn't matter if some of your threads are idle if you are making best use of your CPUs.Pinkney
If your thread pools have as many threads as you have cpus, any individual thread pool can "steal" all the cpus even if all the other thread pools are idle.Pinkney
@PeterLawrey - that's true, but if there are a lot of pools, then you may have poor performance if all threads in all pools are working at the same time.Natural
Each task depends on a large amount of reference data to process, and hence the CPU isn't busy at all times. Moreover, the historical request queue is the one I have trouble with. These are considered low-priority requests, but when capacity is available we intend to process them in full swing. With a total of 35 threads processing at peak times, the box with 16 cores only goes as high as 65%, so I believe the CPU is okay. The throughput is what I want to increase when the load pattern is off.Box
@Natural in which case, you are better off having fewer pools. I usually have one or two. ;)Pinkney
@AnandNadar It sounds like you are actually trying to manage another resource i.e. the database indirectly using threads.Pinkney
A
4

Executors.newWorkStealingPool

Java 8 has factory and utility methods for that in the Executors class: Executors.newWorkStealingPool

That is an implementation of a work-stealing thread pool, which I believe is exactly what you want.
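A minimal sketch of how this could look, assuming Java 8 or later. The class name `WorkStealingPoolSketch` and the `sumOfSquares` helper are made up for illustration; only `Executors.newWorkStealingPool` and `CompletableFuture.supplyAsync` are real JDK calls. All tasks go into a single pool, so no worker sits idle while another task type is backed up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class WorkStealingPoolSketch {

    // Hypothetical helper: runs `taskCount` small tasks on one
    // work-stealing pool and adds up their results.
    static int sumOfSquares(int taskCount) {
        // Target parallelism defaults to the number of available processors.
        ExecutorService pool = Executors.newWorkStealingPool();
        try {
            List<CompletableFuture<Integer>> results = new ArrayList<>();
            for (int i = 1; i <= taskCount; i++) {
                final int n = i;
                // Each task is handed to the shared pool; idle workers
                // pick up whatever is pending.
                results.add(CompletableFuture.supplyAsync(() -> n * n, pool));
            }
            int sum = 0;
            for (CompletableFuture<Integer> r : results) {
                sum += r.join(); // wait for each task to finish
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10));
    }
}
```

Note that this pool makes no guarantee about the order in which submitted tasks run, so it only fits if the task types do not need strict ordering.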

Anomalous answered 20/2, 2016 at 14:37 Comment(1)
The only disadvantage I see with this is that it creates new ForkJoinThreads on demand instead of borrowing these threads from a global pool, maybe a common pool or a pool that the client can pass.Wholly
W
2

Have you considered the ForkJoinPool? The fork-join framework was implemented in a nice modular fashion so you can just use the work-stealing thread pool.
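As a rough sketch of using the pool on its own, without the fork/join task classes: `ForkJoinPool` accepts plain `Runnable` tasks like any other executor. The class and method names below are hypothetical, and the two task "types" are just stand-ins for the separate queues in the question. The `asyncMode` flag is set to `true` here on the assumption that the tasks are independent and never joined by other tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.atomic.AtomicInteger;

class SharedForkJoinPoolSketch {

    // Hypothetical helper: submits two kinds of tasks to one pool
    // and returns how many tasks completed.
    static int runMixedTasks(int perType) {
        int parallelism = Runtime.getRuntime().availableProcessors();
        ForkJoinPool pool = new ForkJoinPool(
                parallelism,
                ForkJoinPool.defaultForkJoinWorkerThreadFactory,
                null,   // no custom UncaughtExceptionHandler
                true);  // asyncMode: FIFO scheduling for forked, never-joined tasks
        AtomicInteger completed = new AtomicInteger();
        List<ForkJoinTask<?>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < perType; i++) {
                // Stand-in for a "live" request task.
                pending.add(pool.submit(() -> { completed.incrementAndGet(); }));
                // Stand-in for a "historical" request task.
                pending.add(pool.submit(() -> { completed.incrementAndGet(); }));
            }
            for (ForkJoinTask<?> task : pending) {
                task.join(); // wait for completion
            }
            return completed.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runMixedTasks(50));
    }
}
```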

Welterweight answered 14/4, 2012 at 13:2 Comment(2)
I read the API but still can't figure out how it's different from the regular ThreadPoolExecutor. Perhaps I'm missing some finer aspects there.Box
Yes I see, what you have is in fact a partitioning scheme which you now want to make flexible -- let the partition boundaries shift according to the workload. "Work stealing" may be a more specialized term for schemes that involve fine task granulation -- a task executing on one thread generates subtasks and pushes them to its own deque so other threads can steal its work. So maybe if you do research by the term "thread pool partitioning" you'll find something suited for your case.Welterweight
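To illustrate the fine-grained kind of work stealing described in the comment above, here is a small sketch using `RecursiveTask`. The `RangeSumTask` class and its threshold are invented for the example. A task that is too big splits itself and forks one half, which lands on the current worker's own deque where idle workers can steal it.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical task: sums the integers in [from, to).
class RangeSumTask extends RecursiveTask<Long> {
    private static final long THRESHOLD = 1_000; // arbitrary cut-off for this sketch

    private final long from; // inclusive
    private final long to;   // exclusive

    RangeSumTask(long from, long to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: do the work directly.
            long sum = 0;
            for (long i = from; i < to; i++) {
                sum += i;
            }
            return sum;
        }
        long mid = from + (to - from) / 2;
        RangeSumTask left = new RangeSumTask(from, mid);
        RangeSumTask right = new RangeSumTask(mid, to);
        left.fork();                        // queued for this or another worker
        long rightSum = right.compute();    // keep working on the other half
        return left.join() + rightSum;
    }

    static long sum(long from, long to) {
        return ForkJoinPool.commonPool().invoke(new RangeSumTask(from, to));
    }

    public static void main(String[] args) {
        System.out.println(sum(1, 100_001));
    }
}
```

This shape suits work that can be split recursively. Coarse, independent tasks pulled from separate queues, as in the question, do not get much from it.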
N
1

You could implement a custom BlockingQueue (I think you mainly need to implement the offer() and take() methods) which is backed by a "primary" queue and zero or more secondary queues. take() would always take from the primary backing queue if it is non-empty; otherwise it can pull from the secondary queues.

In fact, it may be better to have one pool where all workers have access to all the queues, but "prefer" a specific queue. You can come up with your optimal work ratio by assigning different priorities to different workers. In a fully loaded system, your workers should be working at the optimal ratio. In an underloaded system, your workers should be able to help out with other queues.
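A minimal sketch of the "prefer one queue, fall back to the others" lookup, assuming each worker owns one of these objects. `PreferringTaskSource` and its method names are made up for illustration, and this is not a full `BlockingQueue` implementation, just the lookup order a worker loop would use.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: a worker's view of its preferred queue plus the rest.
class PreferringTaskSource<T> {
    private final BlockingQueue<T> primary;
    private final List<BlockingQueue<T>> secondaries;

    PreferringTaskSource(BlockingQueue<T> primary, List<BlockingQueue<T>> secondaries) {
        this.primary = primary;
        this.secondaries = secondaries;
    }

    // Non-blocking: primary queue first, then each secondary in order.
    // Returns null if every queue is empty.
    T pollAny() {
        T task = primary.poll();
        if (task != null) {
            return task;
        }
        for (BlockingQueue<T> queue : secondaries) {
            task = queue.poll();
            if (task != null) {
                return task;
            }
        }
        return null;
    }

    // If nothing is available anywhere, wait on the primary queue for up to
    // the given timeout. Secondary queues are only re-checked on the next
    // call, so a short timeout keeps the worker responsive to them.
    T next(long timeout, TimeUnit unit) throws InterruptedException {
        T task = pollAny();
        return task != null ? task : primary.poll(timeout, unit);
    }
}
```

A worker would call `next(...)` in a loop and run whatever it gets back. Giving different workers different primary queues, or different secondary orderings, is one way to set the work ratio mentioned above.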

Natural answered 14/4, 2012 at 12:50 Comment(1)
This seems like a good idea, and I am leaning toward trying it with a POC.Box
