How to use an async for loop to iterate over a list?

So I need to call an async function for all items in a list. This could be a list of URLs and an async function using aiohttp that gets a response back from every URL. Now obviously I cannot do the following:

async for url in ['www.google.com', 'www.youtube.com', 'www.aol.com']:

I can use a normal for loop but then my code will act synchronously and I lose the benefits and speed of having an async response fetching function.

Is there any way I can convert a list so that the above works? I just need to change the list's __iter__() to an __aiter__() method, right? Can this be achieved by subclassing list? Maybe by encapsulating it in a class?

Incertitude answered 1/1, 2018 at 18:32 Comment(0)

Use asyncio.as_completed:

for future in asyncio.as_completed(map(fetch, urls)):
    result = await future

Or asyncio.gather:

results = await asyncio.gather(*map(fetch, urls))
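
To make the two snippets above concrete, here is a minimal self-contained sketch; the fetch here is a stand-in that simulates I/O latency rather than a real aiohttp request:

```python
import asyncio

async def fetch(url):
    # Stand-in for a real aiohttp-based fetch; simulates network latency.
    await asyncio.sleep(0.01)
    return f"response from {url}"

async def main():
    urls = ['www.google.com', 'www.youtube.com', 'www.aol.com']

    # gather preserves the order of the input list.
    ordered = await asyncio.gather(*map(fetch, urls))

    # as_completed yields awaitables in completion order instead.
    unordered = [await fut for fut in asyncio.as_completed(map(fetch, urls))]

    return ordered, unordered

ordered, unordered = asyncio.run(main())
```

Either way, all the fetches are scheduled up front and run concurrently; the difference is only the order in which you see the results.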

EDIT: If you don't mind having an external dependency, you can use aiostream.stream.map:

from aiostream import stream, pipe

async def fetch_many(urls):
    xs = stream.iterate(urls) | pipe.map(fetch, ordered=True, task_limit=10)
    async for result in xs:
        print(result)

You can control the number of fetch coroutines running concurrently using the task_limit argument, and choose whether to get the results in order, or as soon as they complete.
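
If you would rather stay in the standard library, a comparable concurrency cap can be sketched with asyncio.Semaphore; the fetch below is a stand-in, not a real aiohttp call, and the names are illustrative:

```python
import asyncio

async def fetch(url):
    # Stand-in for a real aiohttp request.
    await asyncio.sleep(0.01)
    return f"response from {url}"

async def fetch_limited(semaphore, url):
    # At most `limit` fetches run at once, which avoids
    # exhausting sockets on large URL lists.
    async with semaphore:
        return await fetch(url)

async def fetch_many(urls, limit=10):
    semaphore = asyncio.Semaphore(limit)
    return await asyncio.gather(*(fetch_limited(semaphore, u) for u in urls))

results = asyncio.run(fetch_many(['www.google.com', 'www.youtube.com']))
```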

See more examples in this demonstration and the documentation.

Disclaimer: I am the project maintainer.

Refusal answered 1/1, 2018 at 18:50 Comment(5)
But here you are using map and that is essentially like a normal for loop, applying the function to every element in the list. Why would this be any faster and how is it async? Map is also limited by the number of parameters the fetch function can receive. I was planning on passing an aiohttp session as well.Incertitude
@MaxSmith In this case, map simply creates the coroutines. Then as_completed (or gather, or wait) schedules them so they can run concurrently. An async for loop is not needed, since the asynchronous call is done within the loop body (and not in the iter/next calls)Refusal
Thank you, this seems to work well except that now the program crashes because there are too many open sockets... I think I'm going to need semaphores...Incertitude
@MaxSmith See my edit for a solution to your socket issue.Refusal
@Refusal can you please check here? #73856091Unpeg

Please note that Vincent's answer has a subtle problem:
You must have a splat (unpacking) operator in front of the map call, otherwise asyncio.gather would receive the list as a single argument. So do it like this:

results = await asyncio.gather(*map(fetch, urls))

Kipp answered 21/11, 2020 at 12:1 Comment(0)

The answers above are correct: your computations are what should be made asynchronous, not the literal for loop itself.

That said, in case you absolutely need to make async for work (converting an Iterable to an AsyncIterable, actual benefits or reasoning be damned), the simplest way I've found is to wrap it in an async generator function (tested with Python 3.12, long after this question was posted):

async def url_aiter():
    for url in ['www.google.com', 'www.youtube.com', 'www.aol.com']:
        yield url

async def main():
    # async for is only valid inside an async function
    async for url in url_aiter():
        ...

This uses the magic conjured by async def to generate what you need, rather than having to explicitly call asyncio.get_running_loop().run_in_executor(...) or similar, avoiding having to reason too hard about converting synchronous code to asynchronous. That conversion should generally be done only at the highest level of your program (main() calling asyncio.run()) or the lowest (after setting up a computation, actually running it), anyway.
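
As a self-contained sketch (the URLs are just the question's examples): note that consuming the generator this way is still sequential; any concurrency still has to come from gather or as_completed on the work you do per item.

```python
import asyncio

async def url_aiter():
    # A plain for loop inside an async generator yields an AsyncIterable.
    for url in ['www.google.com', 'www.youtube.com', 'www.aol.com']:
        yield url

async def collect():
    # async for works here, but items still arrive one at a time.
    return [url async for url in url_aiter()]

urls = asyncio.run(collect())
```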

Wadewadell answered 16/5, 2024 at 21:23 Comment(0)
