Note: this answer covers CPython and the asyncio framework. The concepts, however, should apply to other Python implementations as well as other async frameworks.
How do I write a C-function so I can await on it?
The simplest way to write a C function whose result can be awaited is to have it return an already-made awaitable object, such as an asyncio.Future. Before returning the Future, the code must arrange for the future's result to be set by some asynchronous mechanism. All of these coroutine-based approaches assume that your program is running under some event loop that knows how to schedule the coroutines.
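The future-returning pattern is easiest to see in Python first; a C function would perform the same steps through the C API. The following is a minimal sketch, not a definitive implementation: the name delayed_value is hypothetical, and it assumes it is called while an asyncio event loop is running.

```python
import asyncio

def delayed_value(delay, value):
    """Plain (non-async) function that returns an awaitable Future."""
    # Assumption: we are called from code running under an asyncio loop.
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # Arrange for the result to be set later by the event loop.
    loop.call_later(delay, future.set_result, value)
    return future

async def main():
    # The caller awaits the returned future like any other awaitable.
    return await delayed_value(0.01, "done")

result = asyncio.run(main())
```

Note that delayed_value itself contains no await: it only creates the future, schedules its completion, and returns it.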
But returning a future isn't always enough: sometimes we'd like to define an object with an arbitrary number of suspension points. Returning a future suspends only once (if the returned future is not complete), resumes once the future is completed, and that's it. An awaitable object equivalent to an async def that contains more than one await cannot be implemented by returning a future; it has to implement the protocol that coroutines normally implement. This is somewhat like an iterator implementing a custom __next__ and being used instead of a generator.
Defining a custom awaitable
To define our own awaitable type, we can turn to PEP 492, which specifies exactly which objects can be passed to await. Other than Python functions defined with async def, user-defined types can make objects awaitable by defining the __await__ special method, which Python/C maps to the tp_as_async.am_await part of the PyTypeObject struct.
What this means is that in Python/C, you must do the following:
- specify a non-NULL value for the tp_as_async field of your extension type.
- have its am_await member point to a C function that accepts an instance of your type and returns an instance of another extension type that implements the iterator protocol, i.e. defines tp_iter (trivially defined as PyObject_SelfIter) and tp_iternext.
- the iterator's tp_iternext must advance the coroutine's state machine. Each non-exceptional return from tp_iternext corresponds to a suspension, and the final StopIteration exception signifies the final return from the coroutine. The return value is stored in the value property of StopIteration.
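The Python-level counterpart of these three requirements can be sketched as follows. This is an illustrative example under stated assumptions, not part of the C API: the class names Immediate and _ImmediateIter are hypothetical, and the iterator never suspends, so it only demonstrates how the StopIteration value becomes the result of await.

```python
import asyncio

class _ImmediateIter:
    """Iterator that never suspends: its first __next__ finishes the await."""
    def __init__(self, value):
        self.value = value

    def __iter__(self):  # counterpart of tp_iter
        return self

    def __next__(self):  # counterpart of tp_iternext
        # Raising StopIteration(value) is how the iterator "returns";
        # the value becomes the result of the await expression.
        raise StopIteration(self.value)

class Immediate:
    def __init__(self, value):
        self.value = value

    def __await__(self):  # counterpart of tp_as_async.am_await
        return _ImmediateIter(self.value)

async def main():
    return await Immediate(42)

result = asyncio.run(main())
```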
For the coroutine to be useful, it must also be able to communicate with the event loop that drives it, so that it can specify when it is to be resumed after it has suspended. Most coroutines defined by asyncio expect to be running under the asyncio event loop, and internally use asyncio.get_event_loop() (and/or accept an explicit loop argument) to obtain its services.
Example coroutine
To illustrate what the Python/C code needs to implement, let's consider a simple coroutine expressed as a Python async def, such as this equivalent of asyncio.sleep():
import asyncio

async def my_sleep(n):
    loop = asyncio.get_event_loop()
    future = loop.create_future()
    loop.call_later(n, future.set_result, None)
    await future
    # we get back here after the timeout has elapsed, and
    # immediately return
my_sleep creates a Future, arranges for it to complete (its result to become set) in n seconds, and suspends itself until the future completes. The last part uses await, where await x means "allow x to decide whether we will now suspend or keep executing". An incomplete future always decides to suspend, and the asyncio Task coroutine driver special-cases yielded futures: it suspends the task indefinitely and connects the future's completion to resuming the task. Suspension mechanisms of other event loops (curio etc.) can differ in details, but the underlying idea is the same: await is an optional suspension of execution.
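To make the suspension visible, a coroutine can be driven by hand with send(None), which is roughly what a Task does on each step. The sketch below is only an illustration: wait_for is a hypothetical helper, the loop is created but never run, and the future is completed manually instead of by a timer.

```python
import asyncio

async def wait_for(future):
    return await future

loop = asyncio.new_event_loop()
future = loop.create_future()
coro = wait_for(future)

# First step: the future is incomplete, so the await suspends and the
# future itself is handed to whoever drives the coroutine.
yielded = coro.send(None)

# Complete the future by hand, then resume the coroutine as a driver would.
future.set_result("ready")
try:
    coro.send(None)
    final = None
except StopIteration as stop:
    # The coroutine finished; its return value travels in StopIteration.
    final = stop.value

loop.close()
```

Here yielded is the future object, and final holds the value the coroutine returned after the future was completed.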
__await__() that returns a generator
To translate this to C, we have to get rid of the magic async def function definition, as well as of the await suspension point. Removing the async def is fairly simple: the equivalent ordinary function simply needs to return an object that implements __await__:
def my_sleep(n):
    return _MySleep(n)

class _MySleep:
    def __init__(self, n):
        self.n = n

    def __await__(self):
        return _MySleepIter(self.n)
The __await__ method of the _MySleep object returned by my_sleep() will be automatically called by the await operator to convert an awaitable object (anything passed to await) to an iterator. This iterator will be used to ask the awaited object whether it chooses to suspend or to provide a value. This is much like how the for o in x statement calls x.__iter__() to convert the iterable x to a concrete iterator.
When the returned iterator chooses to suspend, it simply needs to produce a value. The meaning of the value, if any, will be interpreted by the coroutine driver, typically part of an event loop. When the iterator chooses to stop executing and return from await, it needs to stop iterating. Using a generator as a convenience iterator implementation, _MySleepIter would look like this:
def _MySleepIter(n):
    loop = asyncio.get_event_loop()
    future = loop.create_future()
    loop.call_later(n, future.set_result, None)
    # yield from future.__await__()
    for x in future.__await__():
        yield x
As await x maps to yield from x.__await__(), our generator must exhaust the iterator returned by future.__await__(). The iterator returned by Future.__await__ will yield if the future is incomplete, and return the future's result (which we here ignore, but yield from actually provides) otherwise.
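As a quick check that the pieces fit together, the generator-based version can be run under asyncio and timed. This is a sketch under a few assumptions: it uses asyncio.get_running_loop() and yield from in place of the spellings above, and the 0.05-second delay and the main() wrapper are chosen only for illustration.

```python
import asyncio
import time

def _MySleepIter(n):
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    loop.call_later(n, future.set_result, None)
    # Delegate to the future's own iterator until it is exhausted.
    yield from future.__await__()

class _MySleep:
    def __init__(self, n):
        self.n = n

    def __await__(self):
        return _MySleepIter(self.n)

def my_sleep(n):
    return _MySleep(n)

async def main():
    start = time.monotonic()
    await my_sleep(0.05)
    return time.monotonic() - start

elapsed = asyncio.run(main())
```

The measured time should be close to the requested delay, subject to the timer resolution of the platform.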
__await__() that returns a custom iterator
The final obstacle for a C implementation of my_sleep is the use of a generator for _MySleepIter. Fortunately, any generator can be translated to a stateful iterator whose __next__ executes the piece of code up to the next await or return. __next__ implements a state machine version of the generator code, where yield is expressed by returning a value, and return by raising StopIteration. For example:
class _MySleepIter:
    def __init__(self, n):
        self.n = n
        self.state = 0

    def __iter__(self):  # an iterator has to define __iter__
        return self

    def __next__(self):
        if self.state == 0:
            loop = asyncio.get_event_loop()
            self.future = loop.create_future()
            loop.call_later(self.n, self.future.set_result, None)
            self.state = 1
        if self.state == 1:
            if not self.future.done():
                return next(iter(self.future))
            self.state = 2
        if self.state == 2:
            raise StopIteration
        raise AssertionError("invalid state")
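The same state-machine technique extends to any number of suspension points and to returning a value. The sketch below is a hypothetical example (the names YieldTimes and _YieldTimesIter are not from asyncio): it suspends a fixed number of times by returning None, which asyncio's Task interprets as a request to be resumed on the next loop iteration, and then finishes by raising StopIteration with a value.

```python
import asyncio

class _YieldTimesIter:
    """State machine that suspends `times` times, then returns a value."""
    def __init__(self, times):
        self.remaining = times
        self.suspensions = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining > 0:
            self.remaining -= 1
            self.suspensions += 1
            # Returning a value is a suspension; a bare None asks the
            # asyncio Task to resume us on the next loop iteration.
            return None
        # Finishing: the StopIteration value becomes the await result.
        raise StopIteration(self.suspensions)

class YieldTimes:
    def __init__(self, times):
        self.times = times

    def __await__(self):
        return _YieldTimesIter(self.times)

async def main():
    return await YieldTimes(3)

result = asyncio.run(main())
```

Because all state lives in instance attributes rather than in a suspended frame, this shape maps directly onto a C struct plus a tp_iternext function.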
Translation to C
The above is quite a bit of typing, but it works, and it only uses constructs that can be defined with native Python/C functions.
Actually translating the two classes to C is quite straightforward, but beyond the scope of this answer.