Asyncio + Aiohttp Memory Leak when running async function in for loop (python)

I am writing a Python function that makes a large number of requests to an API. It works like this:

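import asyncio
import aiohttp

async def get_one(session, url):
    try:
        # session.get() returns an async context manager, so it needs "async with"
        async with session.get(url) as response:
            resp = await response.json()
    except Exception:
        # Any failure (timeout, connection error, invalid JSON) marks the URL for retry
        resp = None
    return resp, url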

async def get_all(session, urls):
    tasks = [asyncio.create_task(get_one(session, url)) for url in urls]
    results = await asyncio.gather(*tasks)
    return results

async def make_requests(urls):
    timeout = aiohttp.ClientTimeout(sock_read=10, sock_connect=10, total=0.1*len(urls))
    connector = aiohttp.TCPConnector(limit=125)
    async with aiohttp.ClientSession(connector=connector, skip_auto_headers=['User-Agent'], timeout=timeout) as session:
        data = await get_all(session, urls)
        return data

def main(urls):
    results = []

    while urls:
        retry = []
        response = asyncio.run(make_requests(urls))
        for resp, url in response:
            if resp is not None:
                results.append(resp)
            else:
                retry.append(url)
        urls = retry

    return results

The problem is that memory usage keeps growing, especially when more requests fail in the try-except block inside get_one: the more times I have to retry, the more memory the process consumes (something seems to be preventing Python from collecting the garbage).

I have come across an old answer (Asyncio with memory leak (Python)) stating that create_task() (or ensure_future()) is responsible for this, as it keeps a reference to the original task.

But it is still not clear to me whether this is really the case, or how to solve the issue if it is. Any help will be appreciated, thank you!
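One way to narrow down where the memory goes is to compare tracemalloc snapshots between retry rounds. The following is only a diagnostic sketch, not a fix; the helper names top_growth and run_rounds are hypothetical and not part of the code above, and the work callable stands in for one call to asyncio.run(make_requests(urls)).

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the `limit` source lines whose allocated size changed the most."""
    # compare_to() sorts the differences by absolute size change, largest first.
    return after.compare_to(before, "lineno")[:limit]


def run_rounds(work, rounds=3):
    """Call `work()` `rounds` times and collect the allocation diff of each round.

    `work` is a hypothetical stand-in for one retry round, for example
    lambda: asyncio.run(make_requests(urls)).
    """
    tracemalloc.start()
    previous = tracemalloc.take_snapshot()
    reports = []
    for _ in range(rounds):
        work()
        current = tracemalloc.take_snapshot()
        reports.append(top_growth(previous, current))
        previous = current
    tracemalloc.stop()
    return reports
```

Printing each entry of a report shows the file and line number along with the size difference, which should indicate whether the growth comes from task objects, response bodies, or somewhere inside the SSL or connector code.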

Mosquito answered 13/9, 2022 at 16:28 Comment(3)
Minor typo: I'm assuming "if is not None" at the end is supposed to be "if resp is not None". (Theona)
This issue, github.com/encode/httpx/issues/978, may help. It seems that there may be a bug in creating SSL contexts. Try turning off SSL checking, or creating a single context shared among all your clients. (Theona)
@FrankYellin You are correct, thanks! I just corrected it. I will try this out; are you sure this is not specific to the httpx library? (Mosquito)
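The suggestion in the comments above (one SSL context shared by every client) could be sketched for aiohttp roughly as follows. This is a minimal sketch and has not been verified to fix the leak described in the question; the names SHARED_SSL_CONTEXT and make_connector are hypothetical. It assumes that each retry round creates a new connector, as make_requests does, and passes the same context through TCPConnector's ssl parameter.

```python
import ssl

# Built once at import time and reused by every connector (assumption: the
# per-round creation of SSL contexts is what is accumulating).
SHARED_SSL_CONTEXT = ssl.create_default_context()


def make_connector(limit=125):
    """Build a TCPConnector that reuses SHARED_SSL_CONTEXT.

    Call this from inside a coroutine such as make_requests, so that the
    connector is created while the event loop is running.
    """
    import aiohttp  # imported lazily so this module loads without aiohttp installed

    return aiohttp.TCPConnector(limit=limit, ssl=SHARED_SSL_CONTEXT)
```

In make_requests, the line that builds the connector would become connector = make_connector(). Passing ssl=False instead of a context turns off certificate checking altogether, which is the other option mentioned in the comment, at the cost of security.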