How can I limit Parallel.ForEach?
V

5

377

I have a Parallel.ForEach() loop with which I download some webpages. My bandwidth is limited, so I can download only x pages at a time, but Parallel.ForEach processes the whole list of desired webpages at once.

Is there a way to limit the number of threads, or to apply any other limiter, while running Parallel.ForEach?

Demo code:

Parallel.ForEach(listOfWebpages, webpage => {
  Download(webpage);
});

The real task has nothing to do with webpages, so creative web crawling solutions won't help.

Vagrant answered 15/2, 2012 at 9:8 Comment(10)
@jKlaus If the list isn't modified, e.g. it's just a set of URLs, I can't really see the issue? - Hemocyte
@Shiv, given enough time you will... Count your number of executions and compare it to the count of the list. - Dost
@Dost What are you saying will go wrong? - Hemocyte
@Shiv, execute this a few times: dotnetfiddle.net/maKiI5 - Dost
@Dost you are modifying a non-threadsafe element (the integer). I would expect it to not work in that scenario. The OP, on the other hand, is not modifying anything that needs to be threadsafe. - Hemocyte
@Shiv, are you positive? I haven't seen the source code for Download(). - Dost
@Dost Yes, Download() has no reference to listOfWebpages. - Hemocyte
@Dost Here is an example of Parallel.ForEach that sets the count correctly: dotnetfiddle.net/moqP2C. MSDN link: msdn.microsoft.com/en-us/library/dd997393(v=vs.110).aspx - Showthrough
@Dost - so... you should delete your comments / this whole chain is misleading... what you initially pointed out is not actually a problem with the above code, since he's passing the single current loop item to the method. There's no sharing of variables between threads/loop-executions. - Diptych
Parallel.ForEach is not suitable for throttling I/O operations. Look at this question for proper solutions: How to limit the amount of concurrent async I/O operations? - Upper
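For the async I/O case mentioned in the last comment, one commonly used pattern is to gate the work with a SemaphoreSlim instead of relying on Parallel.ForEach. The following is only a minimal sketch, assuming C# 9 top-level statements; ForEachThrottledAsync, the page names, and the Task.Delay stand-in for a download are all hypothetical and do not come from the thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs 'work' for every item, letting at most
// 'maxConcurrent' invocations be in flight at any moment.
async Task ForEachThrottledAsync<T>(IEnumerable<T> items, int maxConcurrent, Func<T, Task> work)
{
    using var gate = new SemaphoreSlim(maxConcurrent);

    var tasks = items.Select(async item =>
    {
        await gate.WaitAsync();   // wait for a free slot
        try
        {
            await work(item);
        }
        finally
        {
            gate.Release();       // free the slot even if 'work' throws
        }
    }).ToList();                  // ToList starts every task exactly once

    await Task.WhenAll(tasks);
}

// Hypothetical inputs standing in for listOfWebpages.
var pages = Enumerable.Range(1, 10).Select(i => $"page-{i}").ToList();
var done = new ConcurrentBag<string>();

await ForEachThrottledAsync(pages, maxConcurrent: 3, async page =>
{
    await Task.Delay(50);         // stand-in for an async download
    done.Add(page);
});

Console.WriteLine($"Processed {done.Count} items");
```

Unlike MaxDegreeOfParallelism, this does not hold a thread while an item is waiting on I/O; the semaphore only limits how many operations are in flight.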
D
704

You can specify a MaxDegreeOfParallelism in a ParallelOptions parameter:

Parallel.ForEach(
    listOfWebpages,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    webpage => { Download(webpage); }
);

MSDN: Parallel.ForEach

MSDN: ParallelOptions.MaxDegreeOfParallelism
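One way to see the limit in action is to count how many loop bodies run at the same moment. The sketch below assumes C# 9 top-level statements; MeasurePeakConcurrency is a hypothetical helper, and Thread.Sleep stands in for Download(webpage), whose implementation is not shown in the question.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs a simulated workload and returns the highest
// number of iterations observed executing at the same moment.
int MeasurePeakConcurrency(int itemCount, int maxDegree)
{
    int running = 0; // iterations currently inside the loop body
    int peak = 0;    // highest value of 'running' seen so far

    var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

    Parallel.ForEach(Enumerable.Range(0, itemCount), options, _ =>
    {
        int now = Interlocked.Increment(ref running);

        // Lock-free equivalent of "peak = max(peak, now)".
        int seen;
        while (now > (seen = Volatile.Read(ref peak)))
        {
            Interlocked.CompareExchange(ref peak, now, seen);
        }

        Thread.Sleep(20); // stand-in for Download(webpage)
        Interlocked.Decrement(ref running);
    });

    return peak;
}

int observedPeak = MeasurePeakConcurrency(itemCount: 40, maxDegree: 4);
Console.WriteLine($"Peak concurrent iterations: {observedPeak}");
```

The reported peak should never exceed the configured value, since the option caps the number of concurrently running operations rather than splitting the list into a fixed number of rounds. It may be lower than 4 if the scheduler supplies fewer threads.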

Discant answered 15/2, 2012 at 9:11 Comment(5)
It may not apply to this particular case, but I figured I'd throw it out in case anyone wanders across this and finds it useful. Here I am utilizing 75% (rounded up) of the processor count: var opts = new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 1.0)) }; - Dost
Just to save anyone else having to look it up in the documentation, passing a value of -1 is the same as not specifying it at all: "If [the value] is -1, there is no limit on the number of concurrently running operations" - Dorty
It's not clear to me from the documentation: does setting MaxDegreeOfParallelism to 4 (for instance) mean there'll be 4 threads each running 1/4th of the loop iterations (one round of 4 threads dispatched), or does each thread still do one loop iteration and we're just limiting how many run in parallel? - Triacid
To be clear, cores and threads are not the same thing. Depending on the CPU, there are a different number of threads per core, usually 2 per core. For example, if you have a 4 core CPU with 2 threads per core, then you have a max of 8 threads. To adjust @Dost's comment: var opts = new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 2.0)) }; Link to threads vs cores: askubuntu.com/questions/668538/… - Ann
@Ann I think Environment.ProcessorCount is the number of logical processors on the machine, not the cores; as you described, 4 cores with typically 2 threads each would be 8 threads/logical processors max. On such a device, Environment.ProcessorCount would be 8. In the jKlaus example, he is accurately getting 75% of the max thread count (6 of 8); with your example, you end up with 150% of the max thread count (12 of 8). - Heroic
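The arithmetic in the last two comments can be checked with a small sketch, again assuming C# 9 top-level statements. DegreeFromFraction is a hypothetical helper name, and the value 8 is the example logical-processor count used in the comments, not a measured one.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: the given fraction of the logical processor count,
// rounded up, and never less than 1 (MaxDegreeOfParallelism rejects 0).
int DegreeFromFraction(int logicalProcessors, double fraction)
{
    int degree = (int)Math.Ceiling(logicalProcessors * fraction);
    return Math.Max(1, degree);
}

// Example machine from the comments: 4 cores x 2 threads = 8 logical processors.
Console.WriteLine(DegreeFromFraction(8, 0.75)); // 6  (75% of 8)
Console.WriteLine(DegreeFromFraction(8, 1.5));  // 12 (0.75 * 2.0, i.e. 150% of 8)

// On the current machine, Environment.ProcessorCount is already the logical count.
int degreeHere = DegreeFromFraction(Environment.ProcessorCount, 0.75);
var opts = new ParallelOptions { MaxDegreeOfParallelism = degreeHere };
Console.WriteLine($"Using MaxDegreeOfParallelism = {opts.MaxDegreeOfParallelism}");
```

Clamping to at least 1 is a design choice in this sketch so that a small fraction on a single-processor machine still yields a valid setting.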
K
62

You can use ParallelOptions and set MaxDegreeOfParallelism to limit the number of concurrent threads:

Parallel.ForEach(
    listOfWebpages,
    new ParallelOptions { MaxDegreeOfParallelism = 2 },
    webpage => { Download(webpage); });
Kevakevan answered 15/2, 2012 at 9:11 Comment(0)
Q
25

Use another overload of Parallel.ForEach that takes a ParallelOptions instance, and set MaxDegreeOfParallelism to limit how many instances execute in parallel.

Questa answered 15/2, 2012 at 9:12 Comment(0)
M
17

And for the VB.net users (syntax is weird and difficult to find)...

Parallel.ForEach(listOfWebpages, New ParallelOptions() With {.MaxDegreeOfParallelism = 8}, Sub(webpage)
    ' ...
End Sub)
Mistrot answered 16/8, 2016 at 18:18 Comment(0)
F
5

I think the more dynamic and realistic approach would be to limit it by the processor count, so on each system it would function properly:

var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
Parallel.ForEach(myList, options, iter => { });

Perhaps you could multiply or divide Environment.ProcessorCount to put more pressure on the CPU or take some off it.

Foveola answered 9/6, 2023 at 13:44 Comment(0)