Asyncio RuntimeError: Event Loop is Closed

I'm trying to make a bunch of requests (~1000) using Asyncio and the aiohttp library, but I am running into a problem that I can't find much info on.

When I run this code with 10 URLs, it runs just fine. When I run it with 100+ URLs, it breaks with RuntimeError: Event loop is closed.

import asyncio
import aiohttp


@asyncio.coroutine
def get_status(url):
    code = '000'
    try:
        res = yield from asyncio.wait_for(aiohttp.request('GET', url), 4)
        code = res.status
        res.close()
    except Exception as e:
        print(e)
    print(code)


if __name__ == "__main__":
    urls = ['https://google.com/'] * 100
    coros = [asyncio.Task(get_status(url)) for url in urls]
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.wait(coros))
    loop.close()

The stack trace can be found here.

Any help or insight would be greatly appreciated as I've been banging my head over this for a few hours now. Obviously this would suggest that an event loop has been closed that should still be open, but I don't see how that is possible.

Ratable answered 16/9, 2015 at 1:22 Comment(6)
This is not an asyncio error; it's a Python recursion error, the limit was reached. You need a thread for all non-class functions...Tanta
First, make sure you are using the latest aiohttp release; I assume you are. Technically, aiohttp needs one extra loop iteration after a request finishes in order to close the underlying sockets, so insert loop.run_until_complete(asyncio.sleep(0)) before the loop.close() call (see the sketch after these comments).Loftin
Your traceback suggests that a job submitted to an Executor through run_in_executor returned after the loop has been closed. Weirdly enough, aiohttp and asyncio don't use run_in_executor...Southwest
@AndrewSvetlov, thanks for the reply - I tried sleeping before close, but still no dice... any other ideas?Ratable
@Southwest Technically they do: DNS resolution is performed via run_in_executor -- but it should finish before the get_status tasks do.Loftin
For anyone using python's async socket.io, make sure to run await sio.wait() in your main functionRajiv
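
A minimal sketch of the sleep(0) workaround suggested in the comments above, applied to the question's own script (same 2015-era aiohttp API); the extra loop iteration gives aiohttp a chance to close its underlying sockets before the loop itself is closed:

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait(coros))
# One more trip through the loop so aiohttp can close
# the sockets left over from the finished requests.
loop.run_until_complete(asyncio.sleep(0))
loop.close()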

You're right, loop.getaddrinfo uses a ThreadPoolExecutor to run socket.getaddrinfo in a thread.

You're using asyncio.wait_for with a timeout, which means res = yield from asyncio.wait_for... will raise an asyncio.TimeoutError after 4 seconds. The get_status coroutines then return None and the loop stops. If an executor job finishes after that, it tries to schedule a callback in the (now closed) event loop and raises an exception.
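
A minimal sketch of that failure mode, independent of aiohttp (slow_job here is illustrative, standing in for a socket.getaddrinfo call that outlives the timeout):

import asyncio
import time

def slow_job():
    # Stands in for a getaddrinfo call that takes longer than the timeout.
    time.sleep(2)

loop = asyncio.get_event_loop()
fut = loop.run_in_executor(None, slow_job)  # submitted to the default ThreadPoolExecutor
try:
    loop.run_until_complete(asyncio.wait_for(fut, 0.1))
except asyncio.TimeoutError:
    pass  # the worker thread is still running slow_job
loop.close()
# When slow_job finishes, the executor tries to schedule a callback
# on the closed loop: "RuntimeError: Event loop is closed".
time.sleep(3)  # keep the process alive long enough to observe it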

Southwest answered 16/9, 2015 at 14:26 Comment(5)
Ahh, that makes sense, but this is the only way I have found to implement request timeouts. Do you know of a way that I could timeout without closing the loop?Ratable
@PatrickAllen You might want to increase the number of workers, which is 5 by default.Southwest
@PatrickAllen Or use loop._default_executor.shutdown(wait=True) before closing the loop.Southwest
I'll mark this as answered, because this seems to have fixed the original problem. Should I be limiting the max number of connections? It seems that requests are timing out for no apparent reason. Maybe I'm making too many requests too quickly?Ratable
@PatrickAllen Well, 5 worker threads and a thousand requests means each worker has to run 200 socket.getaddrinfo calls within the 4-second timeout, so the timeouts seem plausible to me, even though the number of workers can be increased. You can also pass a custom TCPConnector to request in order to specify a connection timeout: connector=aiohttp.TCPConnector(loop=loop, force_close=True, conn_timeout=1) -- see the sketch below.Southwest
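
A sketch of that connector suggestion against the 2015-era aiohttp API used in the question (newer aiohttp releases replaced conn_timeout and module-level request with ClientTimeout and the ClientSession API):

connector = aiohttp.TCPConnector(loop=loop, force_close=True, conn_timeout=1)
res = yield from asyncio.wait_for(
    aiohttp.request('GET', url, connector=connector), 4)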

The bug is filed as https://github.com/python/asyncio/issues/258. Stay tuned.

As a quick workaround, I suggest using a custom executor, e.g.

import concurrent.futures

loop = asyncio.get_event_loop()
executor = concurrent.futures.ThreadPoolExecutor(5)
loop.set_default_executor(executor)

Before finishing your program, please do:

executor.shutdown(wait=True)
loop.close()
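
Put together with the script from the question, the workaround would look roughly like this (a sketch that keeps the question's old-style coroutines):

import asyncio
import concurrent.futures

loop = asyncio.get_event_loop()
executor = concurrent.futures.ThreadPoolExecutor(5)
loop.set_default_executor(executor)

coros = [asyncio.Task(get_status(url)) for url in urls]
loop.run_until_complete(asyncio.wait(coros))

# Wait for the worker threads (e.g. in-flight getaddrinfo calls)
# to finish before the loop itself is closed.
executor.shutdown(wait=True)
loop.close()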
Loftin answered 16/9, 2015 at 17:43 Comment(3)
Awesome Andrew, thanks for your help. I didn't realize I was talking to part of the team :). Following this on GHRatable
Changed in version 3.5.3: BaseEventLoop.run_in_executor() no longer configures the max_workers of the thread pool executor it createsAdiabatic
Andrew, can you suggest not a "quick workaround" but some robust workaround for Python 3.5?Adiabatic

This is a bug in the interpreter. Fortunately, it was finally fixed in Python 3.10.6, so you just need to update your installed Python.
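
For reference, on a fixed interpreter and a current aiohttp release the same task would be written roughly like this; a sketch of the modern idiom (asyncio.run plus ClientSession), not the code from the question:

import asyncio
import aiohttp

async def get_status(session, url):
    try:
        async with session.get(url) as res:  # per-request timeout comes from the session
            print(res.status)
    except Exception as e:
        print(e)

async def main():
    urls = ['https://google.com/'] * 100
    timeout = aiohttp.ClientTimeout(total=4)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        await asyncio.gather(*(get_status(session, url) for url in urls))

asyncio.run(main())  # creates, runs, and cleanly closes the event loop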

Censurable answered 26/10, 2022 at 18:44 Comment(0)
