Is there any linter that detects blocking calls in an async function?

https://www.aeracode.org/2018/02/19/python-async-simplified/

It's not going to ruin your day if you call a non-blocking synchronous function, like this:

def get_chat_id(name):
    return "chat-%s" % name

async def main():
    result = get_chat_id("django")

However, if you call a blocking function, like the Django ORM, the code inside the async function will look identical, but now it's dangerous code that might block the entire event loop as it's not awaiting:

def get_chat_id(name):
    return Chat.objects.get(name=name).id

async def main():
    result = get_chat_id("django")

You can see how it's easy to have a non-blocking function that "accidentally" becomes blocking if a programmer is not super-aware of everything that calls it. This is why I recommend you never call anything synchronous from an async function without doing it safely, or without knowing beforehand it's a non-blocking standard library function, like os.path.join.

So I am looking for a way to automatically catch instances of this mistake. Are there any linters for Python which will report sync function calls from within an async function as a violation?

Can I configure Pylint or Flake8 to do this?

I don't necessarily mind if it catches the first case above too (which is harmless).


Update:

On one level I realise this is a stupid question, as pointed out in Mikhail's answer. What we need is a definition of a "dangerous synchronous function" that the linter should detect.

So, for the purpose of this question, I give the following definition:

A "dangerous synchronous function" is one that performs IO operations. These are the same operations which have to be monkey-patched by gevent, for example, or which have to be wrapped in async functions so that the event loop can context switch.

(I would welcome any refinement of this definition)

Improbability answered 1/5, 2019 at 11:27 Comment(5)
We have a definition of "dangerous synchronous function": it's any def-defined function that takes a long time (>0.05 sec.) to execute. It doesn't matter whether that's due to I/O or to something else (like heavy calculations). Either way, neither variant is detectable before runtime: just imagine code that eval()s a random string; you can't predict what will happen without literally executing it. - Nacre
No, there is a difference between functions which take a long time due to executing code on the CPU and functions which take a long time due to waiting on I/O. It is only the latter we care about here, because the async approach will not help at all with the former. - Improbability
The async approach won't help you with the Django ORM at all either, unless you fully rewrite it. Am I making sense? - Nacre
This question is not about the Django ORM. But since the Django ORM a) does I/O and b) is not called via async, it is an example of the code I would like the linter to catch. In other words: if I mistakenly used the Django ORM instead of asyncpg, I want to be warned. - Improbability
Honestly, I don't think this is a dumb question; this is something that could be super helpful to others if there is indeed a solution. I understand that it's a difficult one to determine, though. - Coincident

So I am looking for a way to automatically catch instances of this mistake.

Let's make a few things clear: the mistake discussed in the article is calling any long-running sync function inside an asyncio coroutine (it can be a blocking I/O call or a pure CPU-bound function doing a lot of calculations). It's a mistake because it blocks the whole event loop, which leads to a significant performance degradation (more about it here, including the comments below the answer).

Is there any way to catch this situation automatically? Before run time, no: no one except you can predict whether a particular function will take 10 seconds or 0.01 seconds to execute. At run time, the check is already built into asyncio; all you have to do is enable debug mode.
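For illustration, here is a minimal sketch of what enabling debug mode can look like. The function blocking_call is a hypothetical stand-in for any slow sync call, and the 0.1 second threshold is just asyncio's default written out explicitly:

```python
import asyncio
import logging
import time


def blocking_call():
    # Hypothetical stand-in for a blocking sync function (e.g. an ORM query).
    time.sleep(0.2)


async def main():
    # Steps that hold the loop longer than this are reported in debug mode.
    # 0.1 seconds is asyncio's default; it is set here only to make it visible.
    asyncio.get_running_loop().slow_callback_duration = 0.1
    blocking_call()  # holds the event loop for about 0.2 seconds


logging.basicConfig(level=logging.WARNING)
# debug=True switches on asyncio's debug mode. The "asyncio" logger then emits
# a warning of the form "Executing <Task ...> took 0.2xx seconds".
asyncio.run(main(), debug=True)
```

Debug mode can also be switched on without touching the code by setting the PYTHONASYNCIODEBUG=1 environment variable. Note that this only reports calls that were actually slow during that particular run.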

If you're afraid that some sync function can vary between being long-running (detectable at run time in debug mode) and short-running (not detectable), just execute the function in a background thread using run_in_executor; that guarantees the event loop will not be blocked.
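A rough sketch of the run_in_executor approach, reusing get_chat_id from the question. The time.sleep body is an assumption standing in for the real blocking query, and heartbeat is a hypothetical helper that only exists to show the loop keeps running in the meantime:

```python
import asyncio
import time


def get_chat_id(name):
    # Assumed stand-in for a blocking call such as a Django ORM query.
    time.sleep(0.2)
    return "chat-%s" % name


async def heartbeat(ticks):
    # Hypothetical helper: records a tick every 50 ms, which only happens
    # while the event loop is free to schedule it.
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    # Passing None selects the loop's default ThreadPoolExecutor.
    result = await loop.run_in_executor(None, get_chat_id, "django")
    await beat
    return result, ticks


result, ticks = asyncio.run(main())
print(result)  # chat-django
```

The blocking function now runs on a worker thread, so whatever it touches has to be safe to use from a thread other than the one running the event loop.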

Nacre answered 1/5, 2019 at 18:39 Comment(5)
Such an excellent response to a hard but important question. Love that it shows two different strategies! - Occiput
Also, it's an incorrect answer. - Brendabrendan
@ErikAronesty and what's the correct answer? - Nacre
The correct answer is to use node's blocked-at; static code analysis can also check whether common non-async libs are used inside async functions or their dependents... like read_sync. We implemented this ourselves and it works well. - Brendabrendan
@ErikAronesty the question is about Python's asyncio. Can node's blocked-at be used to debug Python code? - Nacre
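
The static-analysis idea raised in the comments can be sketched in Python with the standard ast module. This is only an illustration under stated assumptions, not an existing linter: the denylist BLOCKING_CALLS and the names BlockingCallFinder and find_blocking_calls are made up for the example, and the check only sees direct calls written by their dotted name inside an async def.

```python
import ast

# Hypothetical denylist of dotted call names assumed to block. Extend as needed.
BLOCKING_CALLS = {"time.sleep", "requests.get", "requests.post"}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


class BlockingCallFinder(ast.NodeVisitor):
    def __init__(self):
        self.async_depth = 0
        self.violations = []  # list of (line number, call name)

    def visit_AsyncFunctionDef(self, node):
        self.async_depth += 1
        self.generic_visit(node)
        self.async_depth -= 1

    def visit_FunctionDef(self, node):
        # The body of a nested sync def is not checked as async code.
        saved, self.async_depth = self.async_depth, 0
        self.generic_visit(node)
        self.async_depth = saved

    def visit_Call(self, node):
        name = dotted_name(node.func)
        if self.async_depth and name in BLOCKING_CALLS:
            self.violations.append((node.lineno, name))
        self.generic_visit(node)


def find_blocking_calls(source):
    finder = BlockingCallFinder()
    finder.visit(ast.parse(source))
    return finder.violations


SAMPLE = '''
import time

async def main():
    time.sleep(1)

def helper():
    time.sleep(1)
'''
print(find_blocking_calls(SAMPLE))  # [(5, 'time.sleep')]
```

A check like this does not follow calls into sync helpers, so the get_chat_id example from the question, where the ORM call is hidden one level down, would not be flagged. It also misses aliased imports such as "from time import sleep".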
