Are you using Parallel Extensions? [closed]

I hope this is not a misuse of stackoverflow; recently I've seen some great questions here on Parallel Extensions, and it got my interest piqued.

My question: Are you using Parallel Extensions, and if so, how?

My name is Stephen Toub and I'm on the Parallel Computing Platform team at Microsoft. We're the group responsible for Parallel Extensions. I'm always interested in hearing about how developers are utilizing Parallel Extensions (e.g. Parallel.For, PLINQ, ConcurrentDictionary, etc.), positive experiences you've had, negative experiences you've had, feature requests for the future, and so on.
If you'd be willing to share such information, please do, either here as a response to this question or to me privately through email at stoub at microsoft dot com.

I'm very much looking forward to hearing from you.

Thanks in advance!

Saturate answered 18/10, 2010 at 14:42 Comment(7)
Since this question can't have a single correct answer, I'm afraid it is not appropriate for StackOverflow. Since you're on the MS team I won't vote to close, but others probably will. You might have better luck on programmers.stackexchange.com and meta.stackoverflow.com. Also, you can mark the question as a "Community Wiki" so votes don't count toward/against rep and it's much less likely to get closed.Nuclease
Ok. Thanks, Sam. Apologies for misusing the site. -StephenSaturate
@Stephen Toub, 12 hours and zero votes to close. Looks like people don't mind it. No answers though either.Nuclease
Probably the only ones who have the knowledge to answer don't see this ... I'm gonna still go with vote to close, or migrate to meta or P.SE ... Also, visit the C# chatroom ...Chaisson
this would have done much better tagged .net...Tied
should definitely be community wikiDisaffection
@Stephen Toub someone using it to parallel sockets: #5835255Tippet

I'm using the TPL for doing nested Parallel.ForEach calls. Because I access dictionaries from these calls I have to use ConcurrentDictionary. Although it's nice, I have a few issues:

  • The delegates inside of ForEach don't do much work, so I don't get much parallelism; the system seems to spend most of its time joining threads. It would be nice if there were a way to figure out why it isn't getting better concurrency and to improve it.

  • The inner ForEach iterations are over ConcurrentDictionary instances, which would cause the system to spend much of its time creating enumerators for the dictionary if I didn't add an enumerator cache.

  • Many of my ConcurrentDictionary instances are actually sets, but there is no ConcurrentSet so I had to implement my own with a ConcurrentDictionary.

  • ConcurrentDictionary does not support object initialization syntax so I can't say var dict = new ConcurrentDictionary<char, int> { { 'A', 65 } }; which also means I can't assign ConcurrentDictionary literals to class members.

  • There are some places where I have to look up a key in a ConcurrentDictionary and call an expensive function to create the value if it doesn't exist. It would be nice if there were an overload of GetOrAdd that takes an addValueFactory so that the value is computed only if the key doesn't exist. This can be simulated with .AddOrUpdate(key, addValueFactory, (k, v) => v), but that adds the overhead of an extra delegate call to every lookup.
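That last workaround can be sketched as follows. This is just an illustration of the AddOrUpdate trick; ExpensiveCreate is a hypothetical stand-in for the costly factory, instrumented so you can see when it actually runs:

```csharp
using System;
using System.Collections.Concurrent;

static class GetOrAddDemo
{
    static int _factoryCalls;

    // Hypothetical expensive factory; counts invocations so we can
    // observe when it actually runs.
    public static int ExpensiveCreate(char key)
    {
        _factoryCalls++;
        return (int)key; // pretend this is costly to compute
    }

    public static int FactoryCalls { get { return _factoryCalls; } }

    // Simulates a factory-taking GetOrAdd with AddOrUpdate: the update
    // delegate (k, v) => v leaves existing values untouched, at the cost
    // of an extra delegate invocation on every lookup that finds the key.
    public static int GetOrAddViaAddOrUpdate(
        ConcurrentDictionary<char, int> dict, char key)
    {
        return dict.AddOrUpdate(key, ExpensiveCreate, (k, v) => v);
    }

    static void Main()
    {
        var dict = new ConcurrentDictionary<char, int>();
        Console.WriteLine(GetOrAddViaAddOrUpdate(dict, 'A')); // factory runs
        Console.WriteLine(GetOrAddViaAddOrUpdate(dict, 'A')); // value reused
        Console.WriteLine("factory calls: " + FactoryCalls);
    }
}
```

On the second call the key already exists, so only the (k, v) => v update delegate fires and the factory is skipped.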

Polar answered 13/12, 2010 at 7:55 Comment(2)
I think you may be better off without the nesting. There's no reason to create more tasks than there are CPUs -- at least not a lot more. I'd recommend getting rid of the inner Parallel.ForEach.Talc
Joe H: The inner ones actually make the process 5-10% faster.Polar

I haven't used it extensively yet, but I've definitely kept an ear out for its uses and look for opportunities in our code base to put it to use (unfortunately, we're still bound to .NET 2.0 on many of our projects for the time being). One little gem I came up with myself was a unique word counter. I think this is the fastest and most concise implementation I can come up with - if someone can make it better, that would be awesomeness:

private static readonly char[] delimiters = { ' ', '.', ',', ';', '\'', '-', ':', '!', '?', '(', ')', '<', '>', '=', '*', '/', '[', ']', '{', '}', '\\', '"', '\r', '\n' };
private static readonly Func<string, string> theWord = Word;
private static readonly Func<IGrouping<string, string>, KeyValuePair<string, int>> theNewWordCount = NewWordCount;
private static readonly Func<KeyValuePair<string, int>, int> theCount = Count;

private static void Main(string[] args)
{
    foreach (var wordCount in File.ReadAllText(args.Length > 0 ? args[0] : @"C:\DEV\CountUniqueWords\CountUniqueWords\Program.cs")
        .Split(delimiters, StringSplitOptions.RemoveEmptyEntries)
        .AsParallel()
        .GroupBy(theWord, StringComparer.OrdinalIgnoreCase)
        .Select(theNewWordCount)
        .OrderByDescending(theCount))
    {
        Console.WriteLine(
            "Word: \""
            + wordCount.Key
            + "\" Count: "
            + wordCount.Value);
    }

    Console.ReadLine();
}

private static string Word(string word)
{
    return word;
}

private static KeyValuePair<string, int> NewWordCount(IGrouping<string, string> wordCount)
{
    return new KeyValuePair<string, int>(
        wordCount.Key,
        wordCount.Count());
}

private static int Count(KeyValuePair<string, int> wordCount)
{
    return wordCount.Value;
}
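For comparison, here is a sketch of the same pipeline written with lambdas in place of the named delegates (the counting logic is the same; only the style differs, and as noted in the comments below, AsParallel may not actually help for this workload):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class WordCountDemo
{
    static readonly char[] Delimiters =
        { ' ', '.', ',', ';', '\'', '-', ':', '!', '?', '(', ')',
          '<', '>', '=', '*', '/', '[', ']', '{', '}', '\\', '"', '\r', '\n' };

    // Same group-and-count pipeline as above, expressed with lambdas.
    public static List<KeyValuePair<string, int>> CountWords(string text)
    {
        return text
            .Split(Delimiters, StringSplitOptions.RemoveEmptyEntries)
            .AsParallel()
            .GroupBy(w => w, StringComparer.OrdinalIgnoreCase)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .OrderByDescending(wc => wc.Value)
            .ToList();
    }

    static void Main(string[] args)
    {
        string text = args.Length > 0
            ? File.ReadAllText(args[0])
            : "the quick brown fox jumps over the lazy dog The end";

        foreach (var wc in CountWords(text))
            Console.WriteLine("Word: \"" + wc.Key + "\" Count: " + wc.Value);
    }
}
```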
Bireme answered 13/12, 2010 at 22:44 Comment(4)
lol, are you uncomfortable with lambda expressions? ;) Anyway, I tried this code with "The Lord of the Rings" eBook as an input. It gives roughly the same results with or without AsParallel, so I guess parallel extensions are not really helping in that case... I also wrote a shorter implementation (along the same lines) that's almost twice as fast, so your implementation is probably not "the fastest and most concise implementation one can come up with" ;). Still, it's an interesting exercise...Fortson
Well, let's see it, man!Bireme
@Thomas - You probably didn't get a notification for Jesse's comment. Let this comment serve as that notification.Crisper
@Greg, indeed I didn't get a notification. But anyway, I don't have that code anymore, I had written it in LinqPad and didn't keep it...Fortson

I have been using it on my project MetaSharp. I have an MSBuild-based compile pipeline for DSLs, and one of the stage types is a many-to-many stage. The M:M stage uses .AsParallel().ForAll(...).

Here's the snippet:

protected sealed override IEnumerable<IContext> Process()
{
    if (this.Input.Count() > 1)
    {
        this.Input
            .AsParallel<IContext>()
            .ForAll(this.Process);
    }
    else if (this.Input.Any())
    {
        this.Process(this.Input.Single());
    }

    return this.Input.ToArray();
}
Permanence answered 13/12, 2010 at 7:25 Comment(3)
I'd recommend this.Input.Skip(1).Any(), so that you can stop counting after hitting the second input.Automaton
Meaning: replace this.Input.Count() > 1 with this.Input.Skip(1).Any().Genro
Good call! That is very useful.Permanence

We don't use it extensively, but it has certainly come in handy.

I was able to reduce the running time of a few of our longer-running unit tests to about 1/3 their original time just by wrapping some of the more time-intensive steps in a Parallel.Invoke() call.
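The pattern is roughly the following. StepA/StepB/StepC are hypothetical stand-ins for the independent, time-intensive test steps (a sketch, not our actual test code):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ParallelInvokeDemo
{
    static int _completed;

    // Hypothetical stand-ins for independent, time-intensive test steps;
    // Thread.Sleep simulates the real work.
    static void StepA() { Thread.Sleep(100); Interlocked.Increment(ref _completed); }
    static void StepB() { Thread.Sleep(100); Interlocked.Increment(ref _completed); }
    static void StepC() { Thread.Sleep(100); Interlocked.Increment(ref _completed); }

    public static int RunSteps()
    {
        // Runs all three steps concurrently and blocks until every one has
        // finished; with enough cores the wall time approaches the slowest
        // single step rather than the sum of all three.
        Parallel.Invoke(StepA, StepB, StepC);
        return _completed;
    }

    static void Main()
    {
        Console.WriteLine("steps completed: " + RunSteps());
    }
}
```

Note that Parallel.Invoke only helps when the wrapped steps are genuinely independent; shared state would need its own synchronization.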

I also love using the parallel libraries for testing thread-safety. I've caught and reported a couple of threading issues with Ninject with code something like this:

var repositoryTypes = from a in CoreAssemblies
                      from t in a.GetTypes()
                      where t.Name.EndsWith("Repository")
                      select t;
repositoryTypes.ToList().AsParallel().ForAll(
    repositoryType => _kernel.Get(repositoryType));

In our actual production code, we use some parallel extensions to run some integration actions that are supposed to run every few minutes, and which consist mostly of pulling data from web services. This takes special advantage of parallelism because of the high latency inherent in web connections, and allows our jobs to all finish running before they're supposed to fire again.
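The shape of those integration jobs is roughly this. The service names and the Thread.Sleep stand-in for network latency are hypothetical; the point is that latency-bound work can profit from a degree of parallelism above the core count, because the threads spend most of their time waiting on I/O:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

static class IntegrationDemo
{
    // Hypothetical fetch: Thread.Sleep stands in for a high-latency
    // web-service call.
    static string Fetch(string url)
    {
        Thread.Sleep(200);
        return "payload from " + url;
    }

    public static ConcurrentBag<string> FetchAll(string[] urls)
    {
        var results = new ConcurrentBag<string>();

        // Raise the degree of parallelism above the default (core count),
        // since the work is dominated by waiting rather than computation.
        urls.AsParallel()
            .WithDegreeOfParallelism(urls.Length)
            .ForAll(u => results.Add(Fetch(u)));

        return results;
    }

    static void Main()
    {
        foreach (var r in FetchAll(new[] { "svc/a", "svc/b", "svc/c", "svc/d" }))
            Console.WriteLine(r);
    }
}
```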

Automaton answered 13/12, 2010 at 23:7 Comment(0)

I am using a ConcurrentDictionary that stores 100 million+ items. My application uses around 8 GB of memory at that point. On another Add, the ConcurrentDictionary then decides it wants to grow, and apparently it wants to grow a LOT (some internal prime-number algorithm), because it runs out of memory. This is on x64 with 32 GB of memory.

Therefore I would like a boolean to block automatic growing/rehashing of a (Concurrent)Dictionary. I would then initialize the dictionary at creation with a fixed set of buckets (this is not the same as a fixed capacity!). It would become a little slower over time as more and more items land in each bucket, but it would prevent rehashing and running out of memory too quickly and unnecessarily.
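There is no switch to freeze the bucket count, but the constructor overload that takes a concurrency level and an initial capacity at least defers the early rounds of growth. A minimal sketch (the capacity value is an illustrative guess, not a recommendation):

```csharp
using System;
using System.Collections.Concurrent;

static class PresizedDictionaryDemo
{
    public static ConcurrentDictionary<long, long> BuildPresized(int expectedItems)
    {
        // Pre-size to the expected item count so the table does not have to
        // grow (and briefly double its footprint) while items are added.
        // Note this only sets the initial capacity; the dictionary will
        // still rehash if the load grows far beyond it.
        int concurrencyLevel = Environment.ProcessorCount;
        return new ConcurrentDictionary<long, long>(concurrencyLevel, expectedItems);
    }

    static void Main()
    {
        var dict = BuildPresized(1000000); // illustrative expected size

        for (long i = 0; i < 1000; i++)
            dict.TryAdd(i, i * i);

        Console.WriteLine(dict.Count);
    }
}
```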

Brunette answered 26/6, 2012 at 12:41 Comment(0)
