Why is the async CTP performing poorly?
Asked Answered
A

2

6

I don't really understand why await and async don't improve the performance of my code here like they're supposed to.

Though skeptical, I thought the compiler was supposed to rewrite my method so that the downloads were done in parallel... but it seems like that's not actually happening.
(I do realize that await and async do not create separate threads; however, the OS should be doing the downloads in parallal, and calling back my code in the original thread -- shouldn't it?)

Am I using async and await improperly? What is the proper way to use them?

Code:

using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

static class Program
{
    static int SumPageSizesSync(string[] uris)
    {
        int total = 0;
        var wc = new WebClient();
        foreach (var uri in uris)
        {
            total += wc.DownloadData(uri).Length;
            Console.WriteLine("Received synchronized data...");
        }
        return total;
    }

    static async Task<int> SumPageSizesAsync(string[] uris)
    {
        int total = 0;
        var wc = new WebClient();
        foreach (var uri in uris)
        {
            var data = await wc.DownloadDataTaskAsync(uri);
            Console.WriteLine("Received async'd CTP data...");
            total += data.Length;
        }
        return total;
    }

    static int SumPageSizesManual(string[] uris)
    {
        int total = 0;
        int remaining = 0;
        foreach (var uri in uris)
        {
            Interlocked.Increment(ref remaining);
            var wc = new WebClient();
            wc.DownloadDataCompleted += (s, e) =>
            {
                Console.WriteLine("Received manually async data...");
                Interlocked.Add(ref total, e.Result.Length);
                Interlocked.Decrement(ref remaining);
            };
            wc.DownloadDataAsync(new Uri(uri));
        }
        while (remaining > 0) { Thread.Sleep(25); }
        return total;
    }

    static void Main(string[] args)
    {
        var uris = new string[]
        {
            // Just found a slow site, to demonstrate the problem :)
            "http://www.europeanchamber.com.cn/view/home",
            "http://www.europeanchamber.com.cn/view/home",
            "http://www.europeanchamber.com.cn/view/home",
            "http://www.europeanchamber.com.cn/view/home",
            "http://www.europeanchamber.com.cn/view/home",
        };
        {
            var start = Environment.TickCount;
            SumPageSizesSync(uris);
            Console.WriteLine("Synchronous: {0} milliseconds", Environment.TickCount - start);
        }
        {
            var start = Environment.TickCount;
            SumPageSizesManual(uris);
            Console.WriteLine("Manual: {0} milliseconds", Environment.TickCount - start);
        }
        {
            var start = Environment.TickCount;
            SumPageSizesAsync(uris).Wait();
            Console.WriteLine("Async CTP: {0} milliseconds", Environment.TickCount - start);
        }
    }
}

Output:

Received synchronized data...
Received synchronized data...
Received synchronized data...
Received synchronized data...
Received synchronized data...
Synchronous: 14336 milliseconds
Received manually async data...
Received manually async data...
Received manually async data...
Received manually async data...
Received manually async data...
Manual: 8627 milliseconds          // Almost twice as fast...
Received async'd CTP data...
Received async'd CTP data...
Received async'd CTP data...
Received async'd CTP data...
Received async'd CTP data...
Async CTP: 13073 milliseconds      // Why so slow??
Ammadis answered 15/1, 2012 at 23:58 Comment(0)
C
10

Chris' answer is almost correct, but introduces a race condition and synchronously blocks on all the tasks.

As a general rule, it's best to not use task continuations if you have await/async available. Also, do not use WaitAny / WaitAll - the async equivalents are WhenAny and WhenAll.

I would write it like this:

static async Task<int> SumPageSizesAsync(IEnumerable<string> uris)
{
  // Start one Task<byte[]> for each download.
  var tasks = uris.Select(uri => new WebClient().DownloadDataTaskAsync(uri));

  // Asynchronously wait for them all to complete.
  var results = await TaskEx.WhenAll(tasks);

  // Calculate the sum.
  return results.Sum(result => result.Length);
}
Colcothar answered 16/1, 2012 at 1:44 Comment(1)
Oh huh, this looks a lot cleaner. =) Thanks!Ammadis
P
1

I may be misreading your code, but it looks like you are launching a background thread to do the async reading and then immediately blocking, waiting for it to complete. Nothing about the 'async' portion of your code is actually async. Try this:

static async Task<int> SumPageSizesAsync(string[] uris)
{
    int total = 0;
    var wc = new WebClient();
    var tasks = new List<Task<byte[]>>();
    foreach (var uri in uris)
    {
        tasks
          .Add(wc.DownloadDataTaskAsync(uri).ContinueWith(() => { total += data.Length;
        }));
    }
    Task.WaitAll(tasks);
    return total;
}

And use it thus:

    {
        var start = Environment.TickCount;
        await SumPageSizesAsync(uris);
        Console.WriteLine("Async CTP: {0} milliseconds", Environment.TickCount - start);
    }

I could be wrong- the async stuff is new and I'm not 100% familiar with it- but the similar timing to the sync version seems to bear me out.

Poulos answered 16/1, 2012 at 0:9 Comment(9)
Hm... so I guess the logical question then is, what's the correct way to do it?Ammadis
Edited to add how I think it it supposed to work. Again, could be totally wrong.Poulos
Thanks! Though I thought the whole point of await was to change the rest of the method into continuation passing style (and the whole point of the async CTP being to eliminate the need for so much plumbing with delegates/lambdas)... so how is this different from e.g. BeginInvoke and whatnot, which we already had since at least .NET 2.0?Ammadis
It is, but in your case your await is inside the loop, so the loop cannot continue until the await statement inside of it returns. At least that's my read. I dont think the await statement enables fork/join parallelism.Poulos
Oooh I see... so await suspends the entire method and goes back to the caller, even though parts of the method might be parallelizable?Ammadis
That's my understanding, yes. It's not as smart as you want it to be.Poulos
Ok I guess that makes sense... but then it brings up the question, what advantage exactly is this page trying to demonstrate in the first place? Can't I just have used the synchronous version?Ammadis
In that case, a single download is occurring, so the whole method can be suspended at that point until the DownloadTaskAsync completes, but at the same point the Task<int> is returned to the caller. The caller can choose to block (or not) on that task, while the download continues in the background. Iterating over a list of URI strings and calling that method for each one would complete almost immediately, but would result in a bunch of Tasks that you'd have to wait on later.Poulos
Hmmm ok so it's just optimizing for the entire set of downloads, not for the individual downloads... thanks for the explanation!Ammadis

© 2022 - 2024 — McMap. All rights reserved.