AsParallel() executing sequentially

I have the following PLINQ query:

// Let's get a few customers
List<Customer> customers = CustomerRepository.GetSomeCustomers();

// Let's get all of the items for all of these customers
List<CustomerItem> items = customers
    .AsParallel()
    .SelectMany(x => ItemRepository.GetItemsByCustomer(x))
    .ToList();

I would expect GetItemsByCustomer() to be executed in parallel for each customer, but it runs sequentially.

I have tried to force parallelism but still without luck:

List<CustomerItem> items = customers
    .AsParallel()
    .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
    .SelectMany(x => ItemRepository.GetItemsByCustomer(x))
    .ToList();

The method signature:

private IEnumerable<CustomerItem> GetItemsByCustomer(Customer customer)
{
    // Get all items for a customer...
}

According to this article, PLINQ can certainly take the sequential route if it deems fit, but forcing parallelism should still override this.

Note: The above example is purely illustrative. Assume customers is a small list and GetItemsByCustomer is an expensive method.

Concerted answered 6/10, 2014 at 6:23 Comment(11)
Can you give a complete, self-contained example? - Holy
Is your real code the same, or are you using the SelectMany overload that takes an index as a parameter? - Burning
@SriramSakthivel: My code is structurally identical to the above. - Concerted
How many cores does your environment have? Scheduling.DefaultDegreeOfParallelism defaults to Math.Min(Environment.ProcessorCount, 512). - Sophisticated
@nvoigt: Sorry if I haven't been clear. What exactly do you need? - Concerted
@davenewza He's asking for a small but complete program to reproduce the problem. - Burning
I would like a small, self-contained example, so I can run it on my system. Your code has so many uncertain parts that it's impossible to give you more than a guess. And that is not what this platform is about. For example, maybe your repository is locking and disabling parallel access? We cannot know. You do. Reduce your example to something we can run; maybe you will find the error along the way. - Holy
In addition, you should specify how you measured your execution to conclude it's not running in parallel. - Brosine
Don't try to use parallelism to speed up slow data access code! It looks like your code tries to execute queries "in parallel" over an ORM's context, which typically uses a single connection. This will force all queries to execute sequentially. Instead of trying to execute multiple queries, use your ORM's mechanisms to either batch all requests into a single one, or create a GetItemsByCustomers method that accepts a list of all the IDs you want and uses an IN (...) argument in the WHERE clause. - Fiddlestick
What ORM are you using? Why are you trying to execute multiple queries in parallel? - Fiddlestick
Is ItemRepository thread-safe? - Finnegan

There is nothing wrong with AsParallel(). It will run in parallel if possible, and there is no sequential dependency in your LINQ expression, so nothing in the query itself forces it to run sequentially.

A few possible reasons why your code doesn't run in parallel:

  1. Your box/VM has a single CPU, or a .NET setting limits the degree of parallelism to one. You can simulate that with this code:

          var customers = new List<Customer>() { new Customer() {Name = "Mick", Surname = "Jagger"}, new Customer() {Name = "George", Surname = "Clooney"}, new Customer() {Name = "Kirk", Surname = "Douglas"}};
    
          var items = customers
            .AsParallel()
            .SelectMany(x =>
            {
                 Console.WriteLine("Requesting: " + x.Name + " - " + DateTime.Now);
                 Thread.Sleep(3000);
                 return new List<CustomerItem>();
    
            })
            .WithDegreeOfParallelism(1)
            .ToList();
    

    Even if you force parallelism with WithExecutionMode(ParallelExecutionMode.ForceParallelism), the setting has no effect on a single-core/single-CPU box or when the degree of parallelism is 1, since true parallelism is not possible.

  2. There is some thread locking on shared resources happening in your repository. You can simulate thread locking with the following code:

        var customers = new List<Customer>() { new Customer() {Name = "Mick", Surname = "Jagger"}, new Customer() {Name = "George", Surname = "Clooney"}, new Customer() {Name = "Kirk", Surname = "Douglas"}};
    
        var locker = new object();
    
        // Let's get all of the items for all of these customers
        var items = customers
            .AsParallel()
            .SelectMany(x =>
            {
                lock (locker)
                {
                    Console.WriteLine("Requesting: " + x.Name + " - " + DateTime.Now);
                    Thread.Sleep(3000);
                    return new List<CustomerItem>();
                }
    
            })
            .ToList();
    
  3. Some database setting is forcing the queries/reads to be sequential under certain circumstances. That could give you the impression that your C# code is not running in parallel, while it actually is.
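
To tell these cases apart, it can help to measure how many calls actually overlap rather than judging by wall-clock time alone. The following is a minimal sketch, not taken from the question: ParallelismProbe, ProbeResult, and SlowLookup are hypothetical names, and SlowLookup is only a stand-in for an expensive call such as GetItemsByCustomer. It records the worker threads used, the peak number of simultaneous calls, and the elapsed time.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

// Hypothetical result holder for the probe.
public sealed class ProbeResult
{
    public int ItemCount;
    public int DistinctThreads;
    public int PeakConcurrency;
    public TimeSpan Elapsed;
}

public static class ParallelismProbe
{
    private static int _inFlight; // calls currently executing
    private static int _peak;     // highest value _inFlight has reached

    // Hypothetical stand-in for an expensive repository call.
    private static IEnumerable<int> SlowLookup(int customerId)
    {
        Thread.Sleep(300);
        return new[] { customerId };
    }

    // Raise _peak to candidate if candidate is larger (lock-free).
    private static void RecordPeak(int candidate)
    {
        int seen = Volatile.Read(ref _peak);
        while (candidate > seen)
        {
            int previous = Interlocked.CompareExchange(ref _peak, candidate, seen);
            if (previous == seen) break;
            seen = previous;
        }
    }

    public static ProbeResult Run(IEnumerable<int> customerIds)
    {
        _inFlight = 0;
        _peak = 0;
        var threadIds = new ConcurrentDictionary<int, bool>();
        var stopwatch = Stopwatch.StartNew();

        List<int> items = customerIds
            .AsParallel()
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .SelectMany(id =>
            {
                threadIds.TryAdd(Environment.CurrentManagedThreadId, true);
                RecordPeak(Interlocked.Increment(ref _inFlight));
                try { return SlowLookup(id); }
                finally { Interlocked.Decrement(ref _inFlight); }
            })
            .ToList();

        stopwatch.Stop();
        return new ProbeResult
        {
            ItemCount = items.Count,
            DistinctThreads = threadIds.Count,
            PeakConcurrency = Volatile.Read(ref _peak),
            Elapsed = stopwatch.Elapsed
        };
    }

    public static void Main()
    {
        ProbeResult result = Run(new[] { 1, 2, 3, 4 });
        Console.WriteLine("Processor count:  " + Environment.ProcessorCount);
        Console.WriteLine("Items returned:   " + result.ItemCount);
        Console.WriteLine("Distinct threads: " + result.DistinctThreads);
        Console.WriteLine("Peak concurrency: " + result.PeakConcurrency);
        Console.WriteLine("Elapsed:          " + result.Elapsed);
    }
}
```

If the peak concurrency stays at 1 on a multi-core machine once SlowLookup is replaced with the real call, something inside that call is probably serializing the work (for example a lock, as in point 2). If the peak is above 1 but the total time is still close to the sum of the individual calls, the bottleneck is more likely outside the C# code, as in point 3.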

Brosine answered 6/10, 2014 at 10:48 Comment(1)
So, it came down to a custom method attribute (for caching), which was performing a lock. I completely overlooked the attribute, but your post led me to it. Thanks - Concerted
