What is the difference between asynchronous programming and multithreading?

I thought that they were basically the same thing — writing programs that split tasks between processors (on machines that have 2+ processors). Then I'm reading this, which says:

Async methods are intended to be non-blocking operations. An await expression in an async method doesn’t block the current thread while the awaited task is running. Instead, the expression signs up the rest of the method as a continuation and returns control to the caller of the async method.

The async and await keywords don't cause additional threads to be created. Async methods don't require multithreading because an async method doesn't run on its own thread. The method runs on the current synchronization context and uses time on the thread only when the method is active. You can use Task.Run to move CPU-bound work to a background thread, but a background thread doesn't help with a process that's just waiting for results to become available.

and I'm wondering whether someone can translate that to English for me. It seems to draw a distinction between asynchronicity (is that a word?) and threading and imply that you can have a program that has asynchronous tasks but no multithreading.

Now I understand the idea of asynchronous tasks such as the example on pg. 467 of Jon Skeet's C# In Depth, Third Edition:

async void DisplayWebsiteLength ( object sender, EventArgs e )
{
    label.Text = "Fetching ...";
    using ( HttpClient client = new HttpClient() )
    {
        Task<string> task = client.GetStringAsync("http://csharpindepth.com");
        string text = await task;
        label.Text = text.Length.ToString();
    }
}

The async keyword means "This function, whenever it is called, will not be called in a context in which its completion is required for everything after its call to be called."

In other words, writing it in the middle of some task

int x = 5; 
DisplayWebsiteLength();
double y = Math.Pow((double)x,2000.0);

, since DisplayWebsiteLength() has nothing to do with x or y, will cause DisplayWebsiteLength() to be executed "in the background", like

                processor 1                |      processor 2
-------------------------------------------------------------------
int x = 5;                                 |  DisplayWebsiteLength()
double y = Math.Pow((double)x,2000.0);     |

Obviously that's a stupid example, but am I correct or am I totally confused or what?

(Also, I'm confused about why sender and e aren't ever used in the body of the above function.)

Respectable answered 8/1, 2016 at 15:53 Comment(6)
This is a nice explanation: blog.stephencleary.com/2013/11/there-is-no-thread.html Garlaand
sender and e are suggesting this is actually an event handler - pretty much the only place where async void is desirable. Most likely, this is called on a button click or something like that - the result being that this action happens completely asynchronously with respect to the rest of the application. But it's still all on one thread - the UI thread (with a tiny sliver of time on an IOCP thread that posts the callback to the UI thread).Veer
Possible duplicate of Difference between Multithreading and Async program in c#Vibraculum
A very important note on the DisplayWebsiteLength code sample: You should not use HttpClient in a using statement - Under a heavy load, the code can exhaust the number of sockets available resulting in SocketException errors. More info on Improper Instantiation.Brainwork
@JakubLortz I don't know who the article is for really. Not for beginners, since it requires good knowledge about threads, interrupts, CPU-related stuff, etc. Not for advanced users, since for them it's all clear already. I'm sure it will not help anyone understand what it's all about - too high a level of abstraction.Creamer
@Creamer I don't consider myself advanced, but I was able to understand and learn what a DPC is and why it differs from "threading".Novelty
1062

Your misunderstanding is extremely common. Many people are taught that multithreading and asynchrony are the same thing, but they are not.

An analogy usually helps. You are cooking in a restaurant. An order comes in for eggs and toast.

  • Synchronous: you cook the eggs, then you cook the toast.
  • Asynchronous, single threaded: you start the eggs cooking and set a timer. You start the toast cooking, and set a timer. While they are both cooking, you clean the kitchen. When the timers go off you take the eggs off the heat and the toast out of the toaster and serve them.
  • Asynchronous, multithreaded: you hire two more cooks, one to cook eggs and one to cook toast. Now you have the problem of coordinating the cooks so that they do not conflict with each other in the kitchen when sharing resources. And you have to pay them.

Now does it make sense that multithreading is only one kind of asynchrony? Threading is about workers; asynchrony is about tasks. In multithreaded workflows you assign tasks to workers. In asynchronous single-threaded workflows you have a graph of tasks where some tasks depend on the results of others; as each task completes it invokes the code that schedules the next task that can run, given the results of the just-completed task. But you (hopefully) only need one worker to perform all the tasks, not one worker per task.
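The single-cook case can be sketched in C#. This is only an illustration under stated assumptions: `Kitchen`, `CookAsync`, and the cooking times are hypothetical names and numbers, and `Task.Delay` stands in for the stove and toaster because it waits on a timer rather than occupying a thread. The first method cooks one dish after the other; the second starts both and then waits for both.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Kitchen
{
    // Hypothetical "appliance": Task.Delay waits on a timer, not on a blocked thread.
    public static async Task<string> CookAsync(string dish, int milliseconds)
    {
        await Task.Delay(milliseconds);
        return dish;
    }

    // One after another: the toast is not started until the eggs are done.
    public static async Task<string[]> OneAfterAnotherAsync()
    {
        string eggs = await CookAsync("eggs", 300);
        string toast = await CookAsync("toast", 200);
        return new[] { eggs, toast };
    }

    // Both at once: start both timers, then wait for both to go off.
    public static async Task<string[]> BothAtOnceAsync()
    {
        Task<string> eggs = CookAsync("eggs", 300);
        Task<string> toast = CookAsync("toast", 200);
        return await Task.WhenAll(eggs, toast);
    }

    public static async Task Main()
    {
        var clock = Stopwatch.StartNew();
        await OneAfterAnotherAsync();
        Console.WriteLine($"one after another: {clock.ElapsedMilliseconds} ms");

        clock.Restart();
        await BothAtOnceAsync();
        Console.WriteLine($"both at once: {clock.ElapsedMilliseconds} ms");
    }
}
```

The second timing should come out shorter than the first, roughly the longer of the two cooking times instead of their sum, even though no extra cook (thread) was hired for either dish.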

It will help to realize that many tasks are not processor-bound. For processor-bound tasks it makes sense to hire as many workers (threads) as there are processors, assign one task to each worker, assign one processor to each worker, and have each processor do the job of nothing else but computing the result as quickly as possible. But for tasks that are not waiting on a processor, you don't need to assign a worker at all. You just wait for the message to arrive that the result is available and do something else while you're waiting. When that message arrives then you can schedule the continuation of the completed task as the next thing on your to-do list to check off.
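A minimal sketch of that distinction, assuming a made-up CPU-bound job (`SumOfSquares`) and using `Task.Delay` as a stand-in for a network or disk wait. The names `Workloads` and `WaitForReplyAsync` are hypothetical. Only the computation is handed to a worker thread with `Task.Run`; the wait needs no worker at all.

```csharp
using System;
using System.Threading.Tasks;

public static class Workloads
{
    // CPU-bound: a processor is busy the whole time, so a worker thread is useful.
    public static long SumOfSquares(int n)
    {
        long total = 0;
        for (long i = 1; i <= n; i++) total += i * i;
        return total;
    }

    // Not CPU-bound: nothing computes while we wait, so no worker is assigned.
    public static async Task<string> WaitForReplyAsync()
    {
        await Task.Delay(100); // stand-in for waiting on a network card or disk
        return "reply";
    }

    public static async Task Main()
    {
        // Task.Run moves the computation onto a thread-pool thread.
        Task<long> compute = Task.Run(() => SumOfSquares(1_000_000));

        // The wait is started on the calling thread and does not occupy it.
        Task<string> wait = WaitForReplyAsync();

        Console.WriteLine($"sum: {await compute}");
        Console.WriteLine($"reply: {await wait}");
    }
}
```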

So let's look at Jon's example in more detail. What happens?

  • Someone invokes DisplayWebsiteLength. Who? We don't care.
  • It sets a label, creates a client, and asks the client to fetch something. The client returns an object representing the task of fetching something. That task is in progress.
  • Is it in progress on another thread? Probably not. Read Stephen's article on why there is no thread.
  • Now we await the task. What happens? We check to see if the task has completed between the time we created it and the time we awaited it. If yes, then we fetch the result and keep running. Let's suppose it has not completed. We sign up the remainder of this method as the continuation of that task and return.
  • Now control has returned to the caller. What does it do? Whatever it wants.
  • Now suppose the task completes. How did it do that? Maybe it was running on another thread, or maybe the caller that we just returned to allowed it to run to completion on the current thread. Regardless, we now have a completed task.
  • The completed task asks the correct thread -- again, likely the only thread -- to run the continuation of the task.
  • Control passes immediately back into the method we just left at the point of the await. Now there is a result available so we can assign text and run the rest of the method.
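The steps above can be approximated by hand. The sketch below is not what the compiler actually emits (the real thing is a generated state machine with error handling); it only mirrors the "check if completed, otherwise sign up a continuation and return" logic. `AwaitByHand`, `ShowLength`, and `Log` are hypothetical names, and a `TaskCompletionSource` plays the part of a task that no thread is running.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitByHand
{
    public static readonly List<string> Log = new List<string>();

    // Roughly what "string text = await task; <rest of method>" does.
    public static void ShowLength(Task<string> task)
    {
        var awaiter = task.GetAwaiter();
        if (awaiter.IsCompleted)
        {
            // Already done: fetch the result and keep running.
            Log.Add("length " + awaiter.GetResult().Length);
        }
        else
        {
            // Not done: sign up the rest of the method as the continuation, then return.
            awaiter.OnCompleted(() => Log.Add("length " + awaiter.GetResult().Length));
            Log.Add("returned to caller");
        }
    }

    public static void Main()
    {
        // This task is not running anywhere; it completes when a result is supplied.
        var fetch = new TaskCompletionSource<string>();

        ShowLength(fetch.Task);
        Log.Add("caller does other work");

        // The "message arrives". In a plain console app, which has no
        // SynchronizationContext, the continuation typically runs right here.
        fetch.SetResult("hello");

        foreach (string line in Log) Console.WriteLine(line);
    }
}
```

In a UI application the awaiter would instead post the continuation back to the UI thread, which is the "asks the correct thread" step in the list.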

It's just like in my analogy. Someone asks you for a document. You send away in the mail for the document, and keep on doing other work. When it arrives in the mail you are signalled, and when you feel like it, you do the rest of the workflow -- open the envelope, pay the delivery fees, whatever. You don't need to hire another worker to do all that for you.

Stokowski answered 8/1, 2016 at 15:58 Comment(38)
At the hardware level, what does an asynchronous task do, then? It just causes the compiler to think about how to rearrange the program better? Like if you have P(async);Q;, then it will rearrange the tasks to Q;P;, or it will do something like A little bit of P; A little bit of Q; A little bit of P; etc. What is the point of either? I don't get it ...Respectable
@user5648283: The hardware is the wrong level to think about tasks. A task is simply an object that (1) represents that a value will become available in the future and (2) can run code (on the correct thread) when that value is available. How any individual task obtains the result in the future is up to it. Some will use special hardware like "disks" and "network cards" to do that; some will use hardware like CPUs.Stokowski
@user5648283: Again, think about my analogy. When someone asks you to cook eggs and toast, you use special hardware -- a stove and a toaster -- and you can clean the kitchen while the hardware is doing its work. If someone asks you for eggs, toast, and an original critique of the last Hobbit movie, you can write your review while the eggs and toast are cooking, but you don't need to use hardware for that.Stokowski
@user5648283: Now as for your question about "rearranging the code", consider this. Suppose you have a method P which has a yield return, and a method Q which does a foreach over the result of P. Step through the code. You'll see that we run a little bit of Q then a little bit of P then a little bit of Q... Do you understand the point of that? await is essentially yield return in fancy dress. Now is it more clear?Stokowski
"Await is essentially yield return in a fancy dress" makes perfect sense! I think I understand everything now.Respectable
@EricLippert And how about Parallel Computing in your cooking example?Anopheles
I am still confused by your analogy "Asynchronous with single thread" above. There is only one cook. It represents a single thread. So what do the toaster and the frying pan correspond to? It seems to me both the toaster and the frying pan are workers as well, which in turn means 2 additional threads. So it contradicts the assumption of a single thread.Arvind
@ArtificialStupidity Exactly. I was thinking the same, because using toaster and frying pan are different processes which now work in parallel and result in multiple threads which contradicts with the "Asynchronous with single thread"Wheresoever
A better example would be to say that the cook cooks the eggs but orders the toast from a different company. So the cook just has to check every x minutes whether the toast has arrived, and the rest of the time continues cooking his eggs. Then you have asynchronous with a single thread.Wheresoever
The toaster is hardware. Hardware doesn't need a thread to service it; disks and network cards and whatnot run at a level far below that of OS threads.Stokowski
Do you think it would be accurate to say, that Asynchronous programming is (generally) more suitable towards problems that are IO bound, whereas Multi-threaded programming is (generally) more suitable towards problems that are CPU bound?Unnecessary
Await is used when you have 1 thread and a problem that will take longer than one refresh of the screen. In systems like the Blender Game Engine, this will allow 60 fps while crunching a big task over a few frames. I just used it for pathfinding today!Glyph
@BluePrintRandom: Of course you can use await for a lot more than that, but this is a really great example of something you could use await for on a single thread. If you don't have an idle processor to heat up to do your pathfinding, then assign a budget, break it up into little pieces, and asynchronously execute the workflow on one thread.Stokowski
Wonderful analogy :)Ozzie
@EricLippert the confusion is because in C# technically all async ways of invocation create a thread.November
@ShivprasadKoirala: That is absolutely not true at all. If you believe that, then you have some very false beliefs about asynchrony. The whole point of asynchrony in C# is that it does not create a thread.Stokowski
@EricLippert First, take a bow from me; thanks for your writings. BTW, I agree asynchrony means no threads, but when I invoke using async/await, why do I see two threads, a main thread and a child thread? I used the VS debugger's thread view to see this. I understand they come from the thread pool, but still, why two threads? I am seriously missing something here. ThanksNovember
@ShivprasadKoirala: An asynchronous method is permitted to use the thread pool to produce asynchrony, but it is by no means required to! Remember what the purpose of asynchrony is: it is to manage high latency operations. If the source of that latency is because the CPU needs to do a few seconds of work, then it makes sense to dedicate a thread to that work, and dedicate a CPU to that thread. But if the high-latency operation is, say, waiting for the network or disk, then it makes no sense to give that a thread! The thread will just sleep!Stokowski
@ShivprasadKoirala: If you have not yet read "There is no thread" by Stephen Cleary, read it now. It explains very clearly why there is no thread that services asynchronous IO requests.Stokowski
@ShivprasadKoirala: Also, suppose there is asynchrony because we are awaiting work that will be done on this thread in the future. Obviously we do not need to create a new thread in order for this thread to do work in the future, any more than you need to hire a cook in order to make your own lunch in the future.Stokowski
@ShivprasadKoirala: Now, there are a lot of badly-written asynchronous functions out there which spin up threads unnecessarily because they use Task.Run when they should not. Perhaps that is why you are seeing a lot of threads come out of the thread pool. But again, those methods are not required to do that. Asynchronous does not mean concurrent. Asynchronous just means that you can do something else while you wait for the result. Concurrency is only one way of achieving that. Again, think about your real life. Do you have to hire a worker every time you wait for something?Stokowski
I did read Stephen's article; it's great. Thanks @EricLippert, that sentence helped me a lot: async means waiting for a result, and why would I need threads just for waiting? My bad. I sat down with a cool head and a simple example and saw that async did not create new threads. Stephen Cleary's discussion and Nicolas's code example also helped me track down what I had done wrong in my code. https://mcmap.net/q/81486/-async-await-different-thread-id Thanks for all the help.November
OMFG this is the best analogy ever <3Eavesdrop
Great answer, wish you could have written about concurrency and parallelism with the same analogy. It's a bit confusing.Airs
Multi Processing: You build a second kitchen and have the eggs cooked over there.Castlereagh
"And you have to pay them." This is the single most important quote that anyone can ever read about multithreading. And for the uninitiated: the cost is HIGH and includes 3-4 years off your life, whatever that's worth to the reader.Delacourt
I'm still not entirely convinced after reading all these comments that concurrency isn't occurring somewhere. That IO-bound hardware you're waiting on is executing a task in parallel, just on a different machine or hardware device. It may be IO or CPU bound, but it's still a separate task. To me, that still seems like parallelism and concurrency. I get it that the parallelism isn't happening locally, i.e. we are not using additional CPUs in the same OS for anything, but parallelism is still happening. The toaster is still toasting and the egg fryer is frying the eggs at the same time.Paternity
"Maybe it was running on another thread, or maybe the caller that we just returned to allowed it to run to completion on the current thread." This irks me. By default it should run on the current thread, right? Otherwise there would be multiple threads.Showthrough
I understand that async doesn't create extra threads. What I don't really understand is conceptually, why async and multithreaded are different things. Fundamentally, when some async code is blocked, its thread has to switch to a different SynchronizationContext and run some other code. Threads have to switch to a different thread context and run some other code. Conceptually, it seems to me like an async task is pretty much the same thing as a thread. What's the real, fundamental difference?Ambrosial
In the given analogy, for multithreaded, "Now you have the problem of coordinating the cooks so that they do not conflict with each other in the kitchen when sharing resources". Well, with async, you have exactly the same problem of coordinating async tasks. And async locks are trickier to implement, so it may even be more tricky than doing multithreaded. C# has some handy built-in language constructs to coordinate async tasks, but why couldn't they be used to coordinate multiple threads instead?Ambrosial
@Jez: Asynchrony and multithreading are conceptually different because "break up my to-do list into smaller tasks and work on any unblocked work item until the whole list is done" is conceptually different than "hire a bunch of workers and assign each of them an item on my list". Multithreading is a common technique for achieving asynchrony.Stokowski
@Jez: Is there a language design in which there are primitives that coordinate both workers and tasks that takes advantage of the similarities in these problem spaces that you point out? Probably! There are lots of opportunities in language design for discovering generalities. Choosing which generalities to bake into your language is a key aspect of the design process.Stokowski
@Jez: An analogy might help. When we were designing C# 3, the generalities we were looking at were the "sorting, filtering, projecting" operators on data structures; LINQ generalizes those across many implementation choices. We could have chosen a much more general solution. SelectMany is the bind operator of the sequence monad; we could have designed the feature to emphasize arbitrary monadic workflows instead of SQL-like operations on sequence-like data.Stokowski
@Jez: We could have also chosen a less general solution; early versions of LINQ were basically a way to embed SQL Server query strings into C#. Finding the right level of generality that meets a user need, and is understandable and teachable, without becoming either vendor-specific or abstract general nonsense is tricky. Undoubtedly there are generalizations of asynchronous workflows; the one we chose to embed in the language was "asynchronous wait". There were other possible choices.Stokowski
If (in principle) async/await could've been implemented as keywords that managed threads instead, then I'm not sure it's right for an analogy to draw some kind of fundamental difference between multithreading and async. "Hire a bunch of workers and assign each of them an item on my list" is what's really happening with async. No, each worker isn't creating a new thread. But it is conceptually creating a separate context in which the task is being run, which seems to me just as "real" as a thread context (multiple threads can run on one processor just as multiple tasks can, with time slicing).Ambrosial
@Jez: sure, if you're going to call all tasks "workers" then there's no difference between tasks and workers. If you call all hands "feet" then I guess we're four-footed animals. I'm not sure where you're going with this.Stokowski
Well, basically, what's the point in using async instead of just using multiple threads. Is it purely down to just worse performance or is there a more conceptual reason?Ambrosial
@Jez: The point of using async is first and foremost making asynchronous workflow code readable. Second, reduce the effort required to write code for workflows where some operations have latencies of more than 30ms. Third, reduce reliance on expensive multithreading to achieve asynchrony. Threads are insanely expensive. Threads in .NET reserve their entire stacks to page file even if they only use a couple of pages!Stokowski
41

In-browser JavaScript is a great example of an asynchronous program that has no multithreading.

You don't have to worry about multiple pieces of code touching the same objects at the same time: each function will finish running before any other JavaScript is allowed to run on the page. (Update: Since this was written, JavaScript has added async functions and generator functions. These functions do not always run to completion before any other JavaScript is executed: whenever they reach a yield or await keyword, they yield execution to other JavaScript, and can continue execution later, similar to C#'s async methods.)

However, when doing something like an AJAX request, no code is running at all, so other JavaScript can respond to things like click events until that request comes back and invokes the callback associated with it. If one of these other event handlers is still running when the AJAX response arrives, the request's callback won't be called until that handler is done. There's only one JavaScript "thread" running, even though it's possible for you to effectively pause the thing you were doing until you have the information you need.

In C# applications, the same thing happens any time you're dealing with UI elements--you're only allowed to interact with UI elements when you're on the UI thread. If the user clicked a button, and you wanted to respond by reading a large file from the disk, an inexperienced programmer might make the mistake of reading the file within the click event handler itself, which would cause the application to "freeze" until the file finished loading because it's not allowed to respond to any more clicking, hovering, or any other UI-related events until that thread is freed.

One option programmers might use to avoid this problem is to create a new thread to load the file, and then tell that thread's code that when the file is loaded it needs to run the remaining code on the UI thread again so it can update UI elements based on what it found in the file. Until recently, this approach was very popular because it was what the C# libraries and language made easy, but it's fundamentally more complicated than it has to be.

If you think about what the CPU is doing when it reads a file at the level of the hardware and operating system, it's basically issuing an instruction to read pieces of data from the disk into memory, and to hit the operating system with an "interrupt" when the read is complete. In other words, reading from disk (or any I/O really) is an inherently asynchronous operation. The concept of a thread waiting for that I/O to complete is an abstraction that the library developers created to make it easier to program against. It's not necessary.

Now, most I/O operations in .NET have a corresponding ...Async() method you can invoke, which returns a Task almost immediately. You can add callbacks to this Task to specify code that you want to have run when the asynchronous operation completes. You can also specify which thread you want that code to run on, and you can provide a token which the asynchronous operation can check from time to time to see if you decided to cancel the asynchronous task, giving it the opportunity to stop its work quickly and gracefully.
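As a rough sketch of that callback style, the example below uses `File.ReadAllTextAsync` (available in .NET Core 2.0 and later, not in .NET Framework) together with `ContinueWith`. The names `CallbackStyle` and `CountCharactersAsync`, the temp file, and the five-second timeout are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class CallbackStyle
{
    public static Task<int> CountCharactersAsync(string path, CancellationToken token)
    {
        // Returns almost immediately with a Task representing the read in progress.
        Task<string> read = File.ReadAllTextAsync(path, token);

        // Attach a callback that runs only if the read finished successfully.
        return read.ContinueWith(
            done => done.Result.Length,
            token,
            TaskContinuationOptions.OnlyOnRanToCompletion,
            TaskScheduler.Default); // the scheduler decides which thread runs the callback
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello async");

        // The token lets the caller cancel; here it cancels itself after five seconds.
        using (var cancel = new CancellationTokenSource(TimeSpan.FromSeconds(5)))
        {
            // Blocking here only to keep this console demo alive until the callback has run.
            int count = CountCharactersAsync(path, cancel.Token).GetAwaiter().GetResult();
            Console.WriteLine(count);
        }

        File.Delete(path);
    }
}
```

In a UI application, passing `TaskScheduler.FromCurrentSynchronizationContext()` instead of `TaskScheduler.Default` (when called from the UI thread) is one way to ask for the callback to run back on the UI thread.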

Until the async/await keywords were added, C# was much more obvious about how callback code gets invoked, because those callbacks were in the form of delegates that you associated with the task. In order to still give you the benefit of using the ...Async() operation, while avoiding complexity in code, async/await abstracts away the creation of those delegates. But they're still there in the compiled code.
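A minimal sketch of what the keywords buy you, assuming .NET Core 2.0 or later for `File.ReadAllTextAsync` and `File.WriteAllTextAsync`. `AwaitStyle` and `CountCharactersAsync` are hypothetical names. The line after the `await` is the code that would otherwise have been written as an explicit callback delegate.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AwaitStyle
{
    public static async Task<int> CountCharactersAsync(string path)
    {
        // The method returns to its caller here if the read has not finished yet.
        string text = await File.ReadAllTextAsync(path);

        // Everything from here on is the continuation the compiler wires up for us.
        return text.Length;
    }

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        await File.WriteAllTextAsync(path, "hello async");

        Console.WriteLine(await CountCharactersAsync(path));

        File.Delete(path);
    }
}
```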

So you can have your UI event handler await an I/O operation, freeing up the UI thread to do other things, and more-or-less automatically returning to the UI thread once you've finished reading the file--without ever having to create a new thread.

Shake answered 8/1, 2016 at 16:1 Comment(6)
"There's only one JavaScript "thread" running" - no longer true with Web Workers.Apostasy
@oleksii: That's technically true, but I wasn't going to go into that because the Web Workers API itself is asynchronous, and Web Workers aren't allowed to directly impact the JavaScript values or the DOM on the web page they're invoked from, which means the crucial second paragraph of this answer still holds true. From the programmer's perspective, there's little difference between invoking a Web Worker and invoking an AJAX request.Shake
"In-browser Javascript is a great example of an asynchronous program that has no threads" - Being a little pedantic, there is always at least 1 thread of execution.Jackfruit
@KejsiStruga: LOL, point taken. Changed "no threads" to "no multithreading"Shake
What's so bad about creating a new thread, though? Is it purely a performance thing? Let's put it another way: suppose performance were not an issue. Now, is there any reason at all to use async programming instead of using multiple threads? "The concept of a thread waiting for that I/O to complete is an abstraction that the library developers created to make it easier to program against.". Easier to program against, indeed. One doesn't need to read long articles about ConfigureAwait or async-safe locks when using synchronous programming.Ambrosial
@Jez: Good question. I can think of a handful of things, but there's a reasonable argument that none of them would justify changing all your code to return Tasks the whole way up your call stack. In many cases it would be reasonable to keep the async/await stuff relegated to the code that works on a dedicated UI thread (if you have one), and turn all your long-running operations into await Task.Run(...)s at that level, rather than refactoring your whole code base. Assuming everything else is thread-safe.Shake
