Hashed/Sharded ActionBlocks

I have a constant flow of items that I need to process in parallel, so I'm using TPL Dataflow. The catch is that items sharing the same key (similar to a Dictionary) must be processed in FIFO order and must not run in parallel with each other (they can run in parallel with items that have different keys).

The work being done is very CPU-bound with minimal asynchronous locking, so my solution was to create an array of ActionBlock<T>s of size Environment.ProcessorCount, each with no parallelism, and post items to them according to the key's GetHashCode value.

Creation:

_actionBlocks = new ActionBlock<Item>[Environment.ProcessorCount];
for (int i = 0; i < _actionBlocks.Length; i++)
{
    // MaxDegreeOfParallelism defaults to 1, so each block processes its queue sequentially (FIFO).
    _actionBlocks[i] = new ActionBlock<Item>(item => ProcessItemAsync(item));
}

Usage:

bool ProcessItem(Key key, Item item)
{
    // The uint cast keeps the index non-negative even when GetHashCode returns a negative value.
    var actionBlock = _actionBlocks[(uint)key.GetHashCode() % _actionBlocks.Length];
    return actionBlock.Post(item);
}
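
Shutdown, for completeness (a sketch, not part of my original code: Complete() stops each block from accepting new items, and awaiting the Completion tasks waits for the queues to drain):

// Stop accepting new items and wait for every shard's queue to drain.
public Task CompleteAsync()
{
    foreach (var actionBlock in _actionBlocks)
    {
        actionBlock.Complete();
    }
    return Task.WhenAll(Array.ConvertAll(_actionBlocks, b => b.Completion));
}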

So, my question is, is this the best solution to my problem? Am I hurting performance/scalability? Am I missing something?

Mauretania asked 9/1, 2014 at 1:25 (5 comments)
I like it. I can't think of another method that wouldn't require storage. I think that as long as you make sure your hash codes are properly distributed, this should be fine. – Youngstown
Relying on the value of GetHashCode sounds very weird to me; why do you have it? Is the actual requirement “items that are equal should be processed in FIFO order”? – Yetty
@Yetty It's more like “items with the same key should be processed in FIFO order”, similar to how you would use a dictionary (it doesn't really have to be the same Item type). I'll update the question to make that clearer. – Mauretania
@I3arnon How do you know that all threads will have at least a comparable amount of work to do? There is a chance that (uint)key.GetHashCode() % _actionBlocks.Length will be badly distributed and some cores won't do anything. – Amarillo
@Amarillo That's true. I've made sure the hashes are as evenly distributed as I can, and through testing I see that is indeed the case. But... this is one of the reasons I've put it out here. – Mauretania

I think your approach is reasonable, assuming you know the hash codes will be distributed well.

If you want better protection against bad distributions, you could use a larger number of ActionBlocks while limiting their total concurrency by having all of the blocks share a single custom TaskScheduler. You can find such a scheduler in ParallelExtensionsExtras or on MSDN.
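
For example, here's a sketch of the idea using the built-in ConcurrentExclusiveSchedulerPair to cap concurrency (any limited-concurrency scheduler, such as the one from ParallelExtensionsExtras, would work the same way; Item and ProcessItemAsync are the types from the question):

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Cap the total number of concurrently executing blocks at ProcessorCount,
// while using many more blocks than cores. A "hot" hash bucket then backs up
// only its own small queue, and the shared scheduler keeps all threads busy.
var scheduler = new ConcurrentExclusiveSchedulerPair(
    TaskScheduler.Default,
    maxConcurrencyLevel: Environment.ProcessorCount).ConcurrentScheduler;

var options = new ExecutionDataflowBlockOptions { TaskScheduler = scheduler };

var actionBlocks = new ActionBlock<Item>[Environment.ProcessorCount * 8];
for (int i = 0; i < actionBlocks.Length; i++)
{
    // MaxDegreeOfParallelism is still 1, so per-key FIFO ordering is preserved.
    actionBlocks[i] = new ActionBlock<Item>(item => ProcessItemAsync(item), options);
}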

Yetty answered 9/1, 2014 at 11:51 (5 comments)
How would that solve bad distributions? If I have a "special" hash that gets used more than others, how is having many ActionBlocks that block each other different from using % _actionBlocks.Length? The "special" hash in your case will still make its queue bigger relative to the other ones... – Mauretania
Yes, it will still be bigger than the others, but it's likely to be smaller than with a small number of blocks, because there will be fewer collisions with that special hash. For example, if half of all hashes are 0 and the rest are distributed uniformly, then with 2 blocks, 3/4 of all items will go to block 0. But with 4 blocks, it's just 5/8, and with infinitely many blocks, it would be 1/2. – Yetty
But you would still have only 2 threads. One would handle the 5/8 block and a 1/8 block (6/8 = 3/4), and the other thread would handle the two 1/8 blocks left (2/8 = 1/4). Am I missing something? I get doing that when you also increase the thread count, but this code is very CPU-bound, and AFAIK a single thread per core is recommended. – Mauretania
No, that's not what I meant. You would have 2 threads, but they would be shared across all 4 blocks. This way, the load on the threads should be spread mostly evenly, even when the load on the blocks isn't. – Yetty
Oversubscribing the block count is even required to fully utilize the CPU, because a random distribution is not an equidistribution. Some blocks will be under-utilized, some over-utilized. – Euhemerism
