I'm implementing a worker role on Azure which needs to delete blobs from Azure storage. Let's assume my list of blobs has about 10K items.
The simplest synchronous approach would probably be:
Parallel.ForEach(list, x => ((CloudBlob) x).Delete());
Requirements:
I want to implement the same thing asynchronously (on a single thread).
I want to limit the number of concurrent connections to 50, so that while I work through the 10K deletions, at most 50 async ones are in flight at any time. When one deletion completes, a new one can be started.
Solution?
So far, after reading this question and this one, it seems that TPL Dataflow is the way to go.
This is such a simple problem that Dataflow seems like overkill. Is there any simpler alternative?
If not, how would this be implemented using Dataflow? As I understand it, I need a single action block which performs the async delete (do I need await?). When creating my block I should set MaxDegreeOfParallelism to 50. Then I need to post my 10K blobs from the list to the block and then execute with block.Completion.Wait(). Is this correct?
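To make the question concrete, here is a minimal sketch of the Dataflow approach as I understand it. It is not tied to the Azure SDK: the deleteAsync delegate is a placeholder for whatever async delete call the storage client offers, and the names ThrottledDelete and RunAsync are made up for illustration. One thing the description above leaves out is that block.Complete() has to be called after posting, otherwise block.Completion never finishes.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ThrottledDelete
{
    // Runs deleteAsync once per item, with at most maxConcurrency calls in flight.
    // deleteAsync is a placeholder (assumption) for the real async blob delete.
    public static Task RunAsync<T>(
        IEnumerable<T> items,
        Func<T, Task> deleteAsync,
        int maxConcurrency = 50)
    {
        // Because the delegate returns a Task, the block treats an item as
        // "in progress" until that task completes, so no explicit await is
        // needed here; an async lambda that awaits internally also works.
        var block = new ActionBlock<T>(
            deleteAsync,
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxConcurrency
            });

        // The block's input queue is unbounded by default, so Post accepts
        // every item immediately and the block drains them as slots free up.
        foreach (var item in items)
        {
            block.Post(item);
        }

        // Signal that no more items will arrive; required for Completion to finish.
        block.Complete();
        return block.Completion;
    }
}
```

The caller can then either block with ThrottledDelete.RunAsync(list, deleteAsync).Wait() or await the returned task, depending on whether blocking the worker thread is acceptable.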
Task.WaitAll(tasks) instead of await Task.WhenAll(tasks), since I'm willing to block my single worker thread until all the 10K deletions are complete – Ole