C#: AsParallel - does order matter?

I'm building a simple LINQ to Objects query that I'd like to parallelize, but I'm wondering whether the order of the operators matters.

e.g.

IList<RepeaterItem> items;

var result = items
        .Select(item => item.FindControl("somecontrol"))
        .Where(ctrl => SomeCheck(ctrl))
        .AsParallel();

vs.

var result = items
        .AsParallel()
        .Select(item => item.FindControl("somecontrol"))
        .Where(ctrl => SomeCheck(ctrl));

Would there be any difference?

Acidhead answered 16/2, 2011 at 9:6 Comment(0)
B
25

Absolutely. In the first case, the projection and filtering will be done in series, and only then will anything be parallelized.

In the second case, both the projection and filtering will happen in parallel.

Unless you have a particular reason to use the first version (e.g. the projection has thread affinity, or some other oddness) you should use the second.
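If the projection really does have to stay on the calling thread, a cautious variant of the first version is to materialize the projected items before handing them to PLINQ. The sketch below is only an illustration under stated assumptions: `Projection`, `ProjectionThreads` and the `x % 3 == 0` check are hypothetical stand-ins, and the `ToList()` call is there because, as far as I know, PLINQ may otherwise pull items from a lazy source on its own worker threads.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

static class AffinitySketch
{
    // Thread ids seen by the projection; only ever touched on the calling thread.
    public static readonly List<int> ProjectionThreads = new List<int>();

    // Hypothetical thread-affine projection: records where it ran, then projects.
    static int Projection(int input)
    {
        ProjectionThreads.Add(Thread.CurrentThread.ManagedThreadId);
        return input * 2;
    }

    public static List<int> Run()
    {
        ProjectionThreads.Clear();
        return Enumerable.Range(0, 100)
            .Select(Projection)
            .ToList()                 // the projection completes here, on the caller
            .AsParallel()
            .Where(x => x % 3 == 0)   // hypothetical check, eligible to run in parallel
            .ToList();
    }

    static void Main()
    {
        int caller = Thread.CurrentThread.ManagedThreadId;
        List<int> result = Run();
        Console.WriteLine("Kept {0} items", result.Count);
        Console.WriteLine("Projection stayed on calling thread: {0}",
                          ProjectionThreads.All(id => id == caller));
    }
}
```

The trade-off is that the whole projected sequence is buffered in memory before any filtering starts.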

EDIT: Here's some test code. Flawed as many benchmarks are, but the results are reasonably conclusive:

using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

class Test
{
    static void Main()
    {
        var query = Enumerable.Range(0, 1000)
                              .Select(SlowProjection)
                              .Where(x => x > 10)
                              .AsParallel();
        Stopwatch sw = Stopwatch.StartNew();
        int count = query.Count();
        sw.Stop();
        Console.WriteLine("Count: {0} in {1}ms", count,
                          sw.ElapsedMilliseconds);

        query = Enumerable.Range(0, 1000)
                          .AsParallel()
                          .Select(SlowProjection)
                          .Where(x => x > 10);
        sw = Stopwatch.StartNew();
        count = query.Count();
        sw.Stop();
        Console.WriteLine("Count: {0} in {1}ms", count,
                          sw.ElapsedMilliseconds);
    }

    static int SlowProjection(int input)
    {
        Thread.Sleep(100);
        return input;
    }
}

Results:

Count: 989 in 100183ms
Count: 989 in 13626ms

Now there's a lot of heuristic stuff going on in Parallel Extensions (PFX), but it's pretty obvious that the first result hasn't been parallelized at all, whereas the second has.
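Timing is not the only way to see the difference. The sketch below counts how many projections are executing at the same moment for each placement of `AsParallel()`. The `ConcurrencyProbe` class, `PeakFor` helper and the `WithDegreeOfParallelism(4)` setting are my own additions for illustration, not part of the benchmark above. With `AsParallel()` last, the peak should stay at 1; with it first, the peak depends on the machine and the thread pool, so no particular number is promised.

```csharp
using System;
using System.Linq;
using System.Threading;

static class ConcurrencyProbe
{
    static readonly object gate = new object();
    static int running;   // projections executing right now
    static int peak;      // highest value 'running' has reached

    // Slow projection that also records how many calls overlap.
    public static int SlowProjection(int input)
    {
        lock (gate) { running++; if (running > peak) peak = running; }
        Thread.Sleep(20);
        lock (gate) { running--; }
        return input;
    }

    // Runs the query to completion and returns the peak number of overlapping projections.
    public static int PeakFor(Func<ParallelQuery<int>> build)
    {
        lock (gate) { running = 0; peak = 0; }
        build().Count();
        lock (gate) { return peak; }
    }

    static void Main()
    {
        int late = PeakFor(() => Enumerable.Range(0, 40)
            .Select(SlowProjection).Where(x => x > 10).AsParallel());
        int early = PeakFor(() => Enumerable.Range(0, 40)
            .AsParallel().WithDegreeOfParallelism(4)
            .Select(SlowProjection).Where(x => x > 10));
        Console.WriteLine("AsParallel last:  peak concurrency {0}", late);
        Console.WriteLine("AsParallel first: peak concurrency {0}", early);
    }
}
```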

Backspin answered 16/2, 2011 at 9:12 Comment(4)
+1 Excellent. I was pretty convinced it had to go first as well (version 2), but it's always better to be certain :-) - Acidhead
@Steffen: I'm just running a test to make sure I'm not telling fibs :) - Backspin
How do you test it? I was considering testing it myself, but was unsure how exactly to go about it. - Acidhead
Nice code example, and yes, the performance difference says it all really. - Acidhead
C
2

It does matter, and not just for performance. The sequential and parallel queries below do not return their results in the same order. To process in parallel while keeping the original order, use AsParallel().AsOrdered(), as the third query shows.

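// Assumes: using System; System.Collections.Generic; System.Diagnostics; System.Linq; System.Threading;
var SlowProjection = new Func<int, int>((input) => { Thread.Sleep(100); return input; });

// Times one query and prints its elapsed milliseconds followed by its results.
var Measure = new Action<string, Func<List<int>>>((title, measure) =>
{
    Stopwatch sw = Stopwatch.StartNew();
    var result = measure();
    sw.Stop();
    Console.Write("{0} Time: {1}, Result: ", title, sw.ElapsedMilliseconds);
    foreach (var entry in result) Console.Write(entry + " ");
    Console.WriteLine(); // end the line so each measurement prints on its own row
});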

Measure("Sequential", () => Enumerable.Range(0, 30)
    .Select(SlowProjection).Where(x => x > 10).ToList());
Measure("Parallel", () => Enumerable.Range(0, 30).AsParallel()
    .Select(SlowProjection).Where(x => x > 10).ToList());
Measure("Ordered", () => Enumerable.Range(0, 30).AsParallel().AsOrdered()
    .Select(SlowProjection).Where(x => x > 10).ToList());

Result:

Sequential Time: 6699, Result: 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Parallel Time: 1462, Result: 12 16 22 25 29 14 17 21 24 11 15 18 23 26 13 19 20 27 28
Ordered Time: 1357, Result: 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29

I was surprised that the ordered query was not slower than the unordered one, but the result was consistent over 10+ test runs. I investigated a bit and it turned out to be a "bug" in .NET 4.0. In 4.5, AsParallel() is not slower than AsParallel().AsOrdered().

Reference is here:

http://msdn.microsoft.com/en-us/library/dd460677(v=vs.110).aspx
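The ordering behaviour can also be checked without looking at timings. The following is a minimal sketch, with the slow projection left out so it runs quickly; the `OrderCheck` class and its method names are made up for this example. It compares the ordered result to the sequential one element by element, and the unordered result only after sorting it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OrderCheck
{
    public static List<int> Sequential()
    {
        return Enumerable.Range(0, 30).Where(x => x > 10).ToList();
    }

    // Same elements as Sequential(), but their order is not guaranteed.
    public static List<int> Unordered()
    {
        return Enumerable.Range(0, 30).AsParallel().Where(x => x > 10).ToList();
    }

    // AsOrdered() asks PLINQ to preserve the source order in the result.
    public static List<int> Ordered()
    {
        return Enumerable.Range(0, 30).AsParallel().AsOrdered().Where(x => x > 10).ToList();
    }

    static void Main()
    {
        List<int> expected = Sequential();
        // Ordered query: identical sequence, element by element.
        Console.WriteLine(Ordered().SequenceEqual(expected));                    // True
        // Unordered query: identical contents once sorted.
        Console.WriteLine(Unordered().OrderBy(x => x).SequenceEqual(expected));  // True
    }
}
```

Comparing `Unordered()` directly against `expected` with `SequenceEqual` may or may not succeed on a given run, which is exactly why code that depends on order should not rely on it.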

Colyer answered 23/1, 2014 at 9:44 Comment(2)
Can you speak to why .AsOrdered() makes it faster, or is that just a fluke from running it once? I would think maintaining the order would increase execution time, not decrease it. - Dripstone
I was surprised too, but the result was consistent across multiple (10+) test runs. I don't know why and I didn't investigate, as it wasn't related to the question. :-) - Colyer
