Why does PLINQ use only two threads?

4

7

Say I have an IO-bound task. I'm calling WithDegreeOfParallelism(10) and WithExecutionMode(ParallelExecutionMode.ForceParallelism), but the query still uses only two threads. Why?

I understand PLINQ will usually choose a degree of parallelism equal to my core count, but why does it ignore my specific request for higher parallelism?

static void Main(string[] args)
{
    TestParallel(0.UpTo(8));
}

private static void TestParallel(IEnumerable<int> input)
{
    var timer = new Stopwatch();
    timer.Start();
    var size = input.Count();

    if (input.AsParallel().
        WithDegreeOfParallelism(10).
        WithExecutionMode(ParallelExecutionMode.ForceParallelism).
        Where(IsOdd).Count() != size / 2)
        throw new Exception("Failed to count the odds");

    timer.Stop();
    Console.WriteLine("Tested " + size + " numbers in " + timer.Elapsed.TotalSeconds + " seconds");
}

private static bool IsOdd(int n)
{
    Thread.Sleep(1000);
    return n%2 == 1;
}
Wallack answered 28/11, 2009 at 14:40 Comment(4)
How many processors/cores do you have? - Curator
Two. But I specifically stated the degree of parallelism to be 10. - Wallack
If you have an I/O bound task and running it on multiple threads in parallel improves the speed then it probably wasn't actually I/O bound in the first place, it was just badly written (sync reads instead of async, for example). - Erastian
IO = sockets. If you open multiple sockets to different computers, this can offer a real speedup. - Wallack
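
The "sync reads instead of async" remark above can be illustrated with a minimal sketch. It assumes a C# version with async/await (newer than what was available when the question was asked), uses Task.Delay as a stand-in for a real socket call, and the names AsyncIoSketch, IsOddAsync and CountOddsAsync are hypothetical. It is one possible shape, not the only way to do this.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class AsyncIoSketch
{
    // Hypothetical stand-in for a socket call: waits one second without blocking a thread.
    static async Task<bool> IsOddAsync(int n)
    {
        await Task.Delay(1000);
        return n % 2 == 1;
    }

    // Runs at most maxConcurrency operations at a time and returns the number of odd inputs.
    public static async Task<int> CountOddsAsync(int[] input, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            // ToArray starts every task; each one waits on the semaphore before doing its "IO".
            var tasks = input.Select(async n =>
            {
                await gate.WaitAsync();
                try { return await IsOddAsync(n); }
                finally { gate.Release(); }
            }).ToArray();

            bool[] results = await Task.WhenAll(tasks);
            return results.Count(isOdd => isOdd);
        }
    }

    static void Main()
    {
        var timer = Stopwatch.StartNew();
        int odds = CountOddsAsync(Enumerable.Range(0, 8).ToArray(), 10).GetAwaiter().GetResult();
        Console.WriteLine("Found " + odds + " odds in " + timer.Elapsed.TotalSeconds + " seconds");
    }
}
```

Here the concurrency limit comes from the semaphore, not from the number of threads, so it does not depend on how many cores the machine has.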
10

PLINQ tries to find the optimal number of threads to do the work as quickly as possible. If you only have 2 cores on your CPU, that number is most likely 2. If you had a quad core, you would be more likely to see 4 threads appear, but creating 4 threads on a dual-core machine wouldn't really improve performance for CPU-bound work, because only 2 threads can be active at the same time.

Also, with IO-based operations, it is likely that any extra threads would simply block on the first IO operation performed.

Virge answered 28/11, 2009 at 14:45 Comment(5)
Doesn't really answer my question - why does it choose to use two threads even though I specifically request a degree of parallelism = 10? (Updated question) - Wallack
@ripper234: From the MSDN documentation: "Degree of parallelism is the maximum number of concurrently executing tasks that will be used to process the query". WithDegreeOfParallelism is just a hint that PLINQ should use no more than n threads. msdn.microsoft.com/en-us/library/dd383719%28VS.100%29.aspx - Curator
So ... there's no way to effectively use PLINQ for IO-bound tasks? - Wallack
@ripper234: The "Introduction to PLINQ" article on MSDN specifically suggests bumping up the degree of parallelism for I/O bound tasks: "In cases where a query is performing a significant amount of non-compute-bound work such as File I/O, it might be beneficial to specify a degree of parallelism greater than the number of cores on the machine". But I suppose that PLINQ itself makes the final decision and if it decides (rightly or wrongly) that increasing the parallelism won't help performance then it won't do it! msdn.microsoft.com/en-us/library/dd997425%28VS.100%29.aspx - Curator
This answer appears to be at odds with the quote from this one. - Petaloid
4

10 is the maximum, not a guaranteed thread count:

Sets the degree of parallelism to use in a query. Degree of parallelism is the maximum number of concurrently executing tasks that will be used to process the query.

From here:

MSDN
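
To see how many callbacks actually run at once, something like the following sketch could be used. It reuses the query from the question with an instrumented IsOdd; the class name PlinqConcurrencyProbe and the MeasurePeak helper are made up for illustration, and Volatile.Read assumes .NET 4.5 or later. The peak it reports can be anything from 1 up to the requested degree, depending on the machine and the state of the thread pool.

```csharp
using System;
using System.Linq;
using System.Threading;

static class PlinqConcurrencyProbe
{
    static int running;   // callbacks currently executing
    static int peak;      // highest value of 'running' observed so far

    static bool IsOdd(int n)
    {
        int now = Interlocked.Increment(ref running);

        // Record a new peak with a compare-and-swap loop.
        int seen;
        while (now > (seen = Volatile.Read(ref peak)) &&
               Interlocked.CompareExchange(ref peak, now, seen) != seen) { }

        Thread.Sleep(1000);   // simulated blocking IO, as in the question
        Interlocked.Decrement(ref running);
        return n % 2 == 1;
    }

    // Runs the query from the question and returns the peak number of concurrent callbacks.
    public static int MeasurePeak(int degree)
    {
        running = 0;
        peak = 0;
        Enumerable.Range(0, 8).AsParallel()
            .WithDegreeOfParallelism(degree)
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .Where(IsOdd)
            .Count();
        return peak;
    }

    static void Main()
    {
        Console.WriteLine("Peak concurrent callbacks: " + MeasurePeak(10));
    }
}
```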

Stallfeed answered 28/11, 2009 at 14:55 Comment(1)
By default, PLINQ uses all of the processors on the host computer up to a maximum of 64. You can instruct PLINQ to use no more than a specified number of processors by using the WithDegreeOfParallelism(Of TSource) method. msdn.microsoft.com/en-us/library/dd383719.aspx - Cog
2

It appears PLINQ tunes the number of threads. When I wrapped the above code in a while(true) loop, the first two iterations took two seconds to run, but the third and later ones took only one second. PLINQ noticed that the cores were idle and increased the number of threads. Impressive!
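
One possible explanation for this warm-up (my assumption, not something confirmed in this thread) is that PLINQ runs its work on the .NET thread pool, which starts near the core count and adds workers only gradually while existing ones are blocked. If that is what is happening, raising the pool's minimum before the query should remove the slow first iterations. A rough sketch, with the hypothetical names MinThreadsSketch and CountOdds:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

static class MinThreadsSketch
{
    static bool IsOdd(int n)
    {
        Thread.Sleep(1000);   // simulated blocking IO, as in the question
        return n % 2 == 1;
    }

    // Same query shape as in the question, over the numbers 0..7.
    public static int CountOdds(int degree)
    {
        return Enumerable.Range(0, 8).AsParallel()
            .WithDegreeOfParallelism(degree)
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .Count(IsOdd);
    }

    static void Main()
    {
        int workers, ioThreads;
        ThreadPool.GetMinThreads(out workers, out ioThreads);

        // Ask the pool to keep at least 10 workers available without its usual ramp-up delay.
        ThreadPool.SetMinThreads(Math.Max(workers, 10), ioThreads);

        var timer = Stopwatch.StartNew();
        int odds = CountOdds(10);
        Console.WriteLine("Found " + odds + " odds in " + timer.Elapsed.TotalSeconds + " seconds");
    }
}
```

Note that SetMinThreads changes a process-wide setting, so it affects everything else that uses the thread pool as well.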

Wallack answered 28/11, 2009 at 19:45 Comment(1)
Note that for this to happen, you really have to specify WithDegreeOfParallelism, otherwise PLINQ would limit itself to the number of cores on your machine. - Wallack
0

I would agree with Rory, except when it comes to IO. I haven't tested with disk IO, but network IO can definitely be more efficient with more threads than there are CPU cores.

Here is a simple test to show that (it would be more correct to run the test several times for each thread count, since network speed isn't constant, but still):

    [Test]
    public void TestDownloadThreadsImpactToSpeed()
    {
        var sampleImages = Enumerable.Range(0, 100)
            .Select(x => "url to some quite large file from good server which does not have anti DSS stuff.")
            .ToArray();            

        for (int i = 0; i < 8; i++)
        {
            var start = DateTime.Now;
            var threadCount = (int)Math.Pow(2, i);
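            // The upper bound of Parallel.For is exclusive, so pass Length (not Length - 1) to download every image.
            Parallel.For(0, sampleImages.Length, new ParallelOptions {MaxDegreeOfParallelism = threadCount},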
                         index =>
                             {
                                 using (var webClient = new WebClient())
                                 {
                                     webClient.DownloadFile(sampleImages[index],
                                                            string.Format(@"c:\test\{0}", index));
                                 }
                             });

            Console.WriteLine("Number of threads: {0}, Seconds: {1}", threadCount, (DateTime.Now - start).TotalSeconds);
        }
    }

Results with a 500x500 px image from a CDN, on an 8-core machine with an SSD:

Number of threads: 1, Seconds: 25.3904522
Number of threads: 2, Seconds: 10.8986233
Number of threads: 4, Seconds: 9.9325681
Number of threads: 8, Seconds: 3.7352137
Number of threads: 16, Seconds: 3.3071892
Number of threads: 32, Seconds: 3.1421797
Number of threads: 64, Seconds: 3.1161782
Number of threads: 128, Seconds: 3.7272132

I think the last result is slower mainly because there are only 100 images to download :)

The time differences between 8 and 64 threads aren't that big, but that is on an 8-core machine. On a 2-core machine (a cheap end-user notebook), I think forcing 8 threads would have more impact than forcing 64 threads does on an 8-core machine.

Kos answered 21/6, 2012 at 12:49 Comment(3)
Did you average these numbers over, say, 10,000 iterations? - Thalassa
As I mentioned, it would be more correct to run the test several times for each thread count. Anyway, the point is to force more threads on machines with a low CPU count when you're doing network IO. - Kos
It looks like the parallel options are being ignored for >=8. Add some debugging output within parallel body and I believe you will see that only a max of 8 at a time are running and it is throttling. - Petaloid
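
The debugging output suggested in the last comment could look roughly like this. It is only a sketch: Thread.Sleep stands in for the download so that it runs without a network, and the names ParallelForProbe and MeasurePeak are hypothetical. It reports how many loop bodies were really running at the same moment for each requested MaxDegreeOfParallelism.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ParallelForProbe
{
    // Returns the highest number of loop bodies observed running at the same time.
    public static int MeasurePeak(int maxDegree, int iterations)
    {
        object gate = new object();
        int running = 0, peak = 0;

        Parallel.For(0, iterations, new ParallelOptions { MaxDegreeOfParallelism = maxDegree }, index =>
        {
            lock (gate) { running++; if (running > peak) peak = running; }
            Thread.Sleep(100);   // simulated download; swap in real IO to test a network
            lock (gate) { running--; }
        });

        return peak;
    }

    static void Main()
    {
        for (int i = 0; i < 8; i++)
        {
            int requested = 1 << i;   // 1, 2, 4, ..., 128, as in the answer's test
            Console.WriteLine("Requested: {0}, peak concurrent bodies: {1}",
                              requested, MeasurePeak(requested, 100));
        }
    }
}
```

If the peak stays well below the requested value for the larger settings, that would support the throttling explanation; if it tracks the requested value, the flat timings more likely come from the network or the server.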
