The other answers here are pretty great, but I think it's helpful to understand some low-level systems programming to get a full appreciation for what coroutines in Python are doing.
Before digging into this in more detail it is helpful to consider how asynchronous dispatch works in a range of languages.
C / C++:
Broadly speaking it works like this:
- All code is specified by functions
- There is no asynchronous version of a function. Functions just describe code which is to be executed somehow; how the code is executed (multiple threads, process fork, single thread, synchronously, asynchronously) is not specified by the function itself
- You can create a thing called a `std::future`, which represents the result of some function execution which may occur in the future, either asynchronously, as part of the same thread via a `std::promise`, or in a different thread
- There are additional flexibilities. Not only can a `std::future` be created and resolved by dispatching some code using `std::async`, but it can also be resolved by a `std::promise`, which is a thing which you can "move" between different threads.
- This is a very explicit, and very flexible, way of programming. There are lots of options and possibilities
- There are no event loops. If you want one of those, you either have to write it yourself or compile someone else's library
For more detail consult the concurrency support section of C++ language documentation at cppreference.
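For comparison with the Python discussed further down, here is a rough sketch of the same two ideas using Python's `concurrent.futures` module. This is only a loose analogy, not an equivalent of the C++ API: `compute`, `resolve_later` and `demo` are names made up for illustration, and creating a `Future` by hand is something the Python documentation reserves for testing and custom executors.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def compute() -> int:
    # Stand-in for some function we want to dispatch (hypothetical).
    return 8


def resolve_later(fut: Future) -> None:
    # Plays the role of the std::promise side: another thread supplies the value.
    fut.set_result(9)


def demo() -> tuple[int, int]:
    # Loosely like std::async(std::launch::async, ...): run compute() on a worker thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        dispatched = pool.submit(compute)
        first = dispatched.result()  # blocks until the worker has finished

    # Loosely like std::promise / std::future: make the future by hand
    # and let a different thread resolve it.
    manual: Future = Future()
    worker = threading.Thread(target=resolve_later, args=(manual,))
    worker.start()
    second = manual.result()  # blocks until set_result() has been called
    worker.join()
    return first, second


if __name__ == '__main__':
    print(demo())
```

Note that nothing here involves an event loop either. The function being run is an ordinary function, and how it gets executed is decided by the caller.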
Node.js:
Broadly speaking, it's the other extreme:
- The NodeJS runtime (interpreter) is an event loop
- `async` and `await` manage events which are placed on a queue and then later resolved by the event loop
- The runtime contains a few threads for different purposes. Some perform io, and one thread is responsible for running the user code
- It's very structured, but rigid
- It is pretty much the case that all code ends up being async by default. Things end up this way because at some point you probably want to call an asynchronous function, which means you have to `await` it. `await` can't be used outside of an `async function`, so the "chicken-and-egg" problem is fixed by making everything in all levels of the call stack `async`. You can of course write non-async functions and just call them like regular code from within an `async` function, but what ends up happening is `function main()` necessarily ends up being `async function main()`.
- BUT this is ok, because unlike Python, NodeJS knows how to call an asynchronous function by default. The event loop is built in, so you can just call `main()`, regardless of whether main is an `async function` or just a plain old `function`.
Python:
- Initially, Python seems a bit weird
- Just like with NodeJS, you can't usefully call an `async` function without `await`ing it
- You can't use `await` outside of an `async def`
- Doesn't this mean, like with NodeJS, that if we have an asynchronous function call somewhere in the call stack, that inevitably the top level function ends up being `async`?
Well - yes it does. What makes Python seem a bit odd is it sits part way between the C++ way of doing things and the NodeJS way of doing things. We don't have an event loop baked in by default, but on the other hand the explicitness of the low level way of doing things in C++ land is hidden or abstracted away from us.
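To make the two restrictions above concrete, here is a small sketch. The names `greet`, `calls`, `call_without_driving` and `await_outside_async_is_rejected` are made up for illustration. It shows that calling an `async def` function only creates a coroutine object without running the body, and that `await` inside a plain `def` is rejected at compile time.

```python
import inspect

calls = []  # records every time the body of greet() actually runs


async def greet(name: str) -> str:
    calls.append(name)
    return f'hello {name}'


def call_without_driving():
    coro = greet('world')                     # returns a coroutine object immediately
    is_coroutine = inspect.iscoroutine(coro)
    body_ran = bool(calls)                    # nothing has driven the coroutine yet
    coro.close()                              # close it explicitly since we never run it
    return is_coroutine, body_ran


def await_outside_async_is_rejected() -> bool:
    # Compile (but do not run) a plain function that contains an await.
    source = 'def plain():\n    await greet("x")\n'
    try:
        compile(source, '<demo>', 'exec')
    except SyntaxError:
        return True
    return False


if __name__ == '__main__':
    print(call_without_driving())
    print(await_outside_async_is_rejected())
```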
We are left with a language which has `async` and `await` as keywords by default (baked into the core language) but that we don't seem to be able to use very easily.
If you search for tutorials about Python `async`/`await`, everything talks about event loops and `asyncio`.
So what's going on?
Let's discuss Python in a little more detail: Python is a high level language. Everything it has to offer typically wraps some lower level (systems-level) code. Let's consider some examples:
- sockets and network programming
- file io
- threads and multiprocess (fork)
- asynchronous code
There are probably other examples which could be added to this list.
The key point is this: In each case, these high level interfaces wrap some lower level interface, which is typically written in C and talks to the OS.
Without going into too much detail, reading and writing files use the same `'w'` and `'r'` as the C level interface. Network sockets use the same `AF_INET`. I add these examples merely to give some indication that there is a very tight similarity between Python and C when it comes to interfacing with the OS.
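As a quick illustration of how thin the wrapping is, the sketch below uses the same mode strings you would pass to C's `fopen` and looks up the same address-family constant you would pass to C's `socket`. The helper name `roundtrip` is made up for this example.

```python
import os
import socket
import tempfile


def roundtrip(text: str) -> str:
    """Write text to a temporary file and read it back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'w') as handle:   # same mode string as fopen(path, "w")
            handle.write(text)
        with open(path, 'r') as handle:   # same mode string as fopen(path, "r")
            return handle.read()
    finally:
        os.remove(path)


if __name__ == '__main__':
    print(roundtrip('hello'))
    # Same name as the constant used with socket() in C.
    print(socket.AF_INET.name)
```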
What Python provides is an interpreted environment layered on top of some lower level systems environment. It provides you with a garbage collector and a memory model based around references to objects, which removes the need for the programmer to think deeply about manually allocating and freeing memory (as would be required in a systems level language like C or C++).
A key point to note is the Python interpreter is a single threaded executor. There isn't that much difference between some interpreted Python code being executed and some lower level compiled language code like C or C++ being executed. They both run in a synchronous way as a single thread.
Given all of the above:
- we should expect coroutines and `async` / `await` to function in pretty much the same way as in some other language like C++
But they don't appear to, at least not if you read the documentation for `asyncio`, which talks in terms of the higher level concept of event loops. What are these mysterious event loops?
Other answers have already provided part of the answer. Unlike NodeJS where the event loop is baked into the runtime, in Python we have to import it.
- Ok great, so this explains why `async` / `await` looks like the `async` / `await` in NodeJS, but an `async def main()` can't be called by default in Python, whereas it can be called by default in NodeJS.
- To make things work in Python we need to import an event loop and run our `async` code using it, or do some manual work to get async code to run by ourselves
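The first of those two options is the one most tutorials show. A minimal sketch, assuming Python 3.7 or later for `asyncio.run` (the function name `add_one` is made up for illustration):

```python
import asyncio


async def add_one(arg: int) -> int:
    await asyncio.sleep(0)  # hand control back to the event loop once
    return arg + 1


async def main() -> int:
    return await add_one(42)


if __name__ == '__main__':
    # asyncio.run() creates an event loop, drives main() to completion
    # and returns whatever main() returned.
    print(asyncio.run(main()))
```

The second option, driving the coroutine by hand, needs no import at all.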
Instinctively, we now know we should be able to do this.
- We know Python provides a high level interface around the C code which our operating system is (most likely) written in and so we should be able to do the same things as can be done in C++ (which itself is built on top of C, in exactly the same way as Python is)
- The OS manages execution of program code
So how do we write this C++ code in Python?
// https://en.cppreference.com/w/cpp/thread/future
#include <future>
#include <iostream>

int main() {
    // future from an async()
    // Note: The latter part is a lambda function: `[] { return 8; }`
    std::future<int> f2 =
        std::async(
            std::launch::async,
            []{ return 8; }
        );
    f2.wait();
    std::cout << "Done: f2.get() = " << f2.get() << std::endl;
}
Here's how we do it.
#!/usr/bin/env python3

async def lambda_function(arg:int) -> int:
    print(f'in lambda substitute function: arg={arg}')
    return arg + 1

def main():
    coroutine = lambda_function(42)
    try:
        coroutine.send(None)
    except StopIteration as stop_exception:
        returned_value = stop_exception.value
        print(f'returned_value={returned_value}')
        # Note: also works, but less explicit:
        # print(stop_exception)

if __name__ == '__main__':
    main()
Output:
in lambda substitute function: arg=42
returned_value=43
You can see that all of the concepts we have in C++ are there in the Python code too. `coroutine.send(None)` plays much the same role as `f2.wait()`, for example.
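The coroutine above never suspends, so a single `send(None)` runs it to completion. Here is a sketch of what the "manual work" looks like when the coroutine does suspend part way through. `Suspend`, `worker` and `drive` are hypothetical names invented for this example, and this is a toy driver rather than how `asyncio` is actually implemented.

```python
class Suspend:
    """Hypothetical awaitable that hands control back to the driver once."""

    def __init__(self, tag):
        self.tag = tag

    def __await__(self):
        # The bare yield is what actually suspends the coroutine. Whatever the
        # driver passes to send() becomes the result of the await expression.
        received = yield self.tag
        return received


async def worker(arg: int) -> int:
    first = await Suspend('step 1')
    second = await Suspend('step 2')
    return arg + first + second


def drive(coro, replies):
    """Run coro to completion, feeding it one reply per suspension."""
    yielded = []
    to_send = None  # the first send() on a fresh coroutine must be None
    replies = iter(replies)
    while True:
        try:
            yielded.append(coro.send(to_send))
        except StopIteration as stop:
            return yielded, stop.value
        to_send = next(replies)


if __name__ == '__main__':
    print(drive(worker(1), [10, 100]))
```

An event loop is, very roughly, a more elaborate version of `drive` which decides when to resume each suspended coroutine and what to send into it.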
We can even dig into it a little further with the use of this helper function.
def utility_print_public_attributes(obj:object, name:str) -> None:
    print(f'attributes of {name}:')
    for d in dir(obj):
        if not d.startswith('__'):
            print(d)
            print(type(getattr(obj, d)))
    print(f'')

# Must be `async def`, otherwise baz(42) returns a str and not a coroutine
async def baz(arg):
    print(f'bar -> arg={arg}')
    return 'return value from baz'

baz_coro = baz(42)
print(type(baz_coro))
utility_print_public_attributes(baz_coro, 'baz_coro')
Output:
<class 'coroutine'>
attributes of baz_coro:
close
<class 'builtin_function_or_method'>
cr_await
<class 'NoneType'>
cr_code
<class 'code'>
cr_frame
<class 'frame'>
cr_origin
<class 'NoneType'>
cr_running
<class 'bool'>
cr_suspended
<class 'bool'>
send
<class 'builtin_function_or_method'>
throw
<class 'builtin_function_or_method'>
So we can use `baz_coro.close()`, `baz_coro.send()` and `baz_coro.throw()` to control what the coroutine does.
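To give a feel for the other two methods, here is a small sketch of `throw()` and `close()` acting on a suspended coroutine. `Suspend`, `guarded`, `demo_throw` and `demo_close` are names made up for this example.

```python
import inspect


class Suspend:
    """Hypothetical awaitable that suspends the coroutine exactly once."""

    def __await__(self):
        yield 'suspended'


async def guarded(log: list) -> str:
    try:
        await Suspend()
    except ValueError as error:
        log.append(f'caught {error}')
        return 'recovered'
    finally:
        log.append('cleanup')
    return 'finished normally'


def demo_throw():
    log = []
    coro = guarded(log)
    coro.send(None)                      # run up to the suspension point
    try:
        coro.throw(ValueError('boom'))   # raise inside the coroutine at that point
    except StopIteration as stop:
        return stop.value, log
    raise RuntimeError('coroutine suspended again')


def demo_close():
    log = []
    coro = guarded(log)
    coro.send(None)
    coro.close()                         # raises GeneratorExit inside the coroutine
    return inspect.getcoroutinestate(coro), log


if __name__ == '__main__':
    print(demo_throw())
    print(demo_close())
```

In both cases the `finally` block runs, so the coroutine gets a chance to clean up whichever way the driver decides to end it.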
Similarly, for the returned value:
print(f'launching coro')
try:
    baz_coro.send(None)
except StopIteration as stop_iteration:
    print(type(stop_iteration))
    utility_print_public_attributes(stop_iteration, 'stop_iteration')
    print(stop_iteration.value)
    print(stop_iteration)
Output:
launching coro
bar -> arg=42
<class 'StopIteration'>
attributes of stop_iteration:
add_note
<class 'builtin_function_or_method'>
args
<class 'tuple'>
value
<class 'str'>
with_traceback
<class 'builtin_function_or_method'>
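We can access the attribute `value` to get the value returned by the coroutine. `args` holds the arguments the `StopIteration` exception was constructed with, which here is a one-element tuple containing that same return value (not the inputs to the coroutine).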
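A short sketch showing what those two attributes hold for a coroutine which returns a string (`answer` and `run_once` are made-up names for illustration):

```python
async def answer(arg: int) -> str:
    return f'result for {arg}'


def run_once(coro):
    """Drive a coroutine that is expected to finish without suspending."""
    try:
        coro.send(None)
    except StopIteration as stop:
        return stop.args, stop.value
    raise RuntimeError('coroutine suspended instead of finishing')


if __name__ == '__main__':
    args, value = run_once(answer(42))
    print(args)   # the tuple the exception was constructed with
    print(value)  # the coroutine's return value
```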