os.stat() translates to a stat syscall:
$ strace python3 -c 'import os; os.stat("/")'
[...]
stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[...]
which is blocking, and there's no way to get a non-blocking stat syscall.
asyncio provides non-blocking I/O by using non-blocking system calls, which already exist (see man fcntl, with its O_NONBLOCK flag, or ioctl). So asyncio is not making syscalls asynchronous; it exposes already asynchronous syscalls in a nice way.
It's still possible to use the nice ThreadPoolExecutor abstraction to make your blocking stat
calls in parallel using a pool of threads.
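As a minimal sketch of that idea (not a benchmark, and not necessarily faster than a plain loop), the blocking os.stat calls can be handed to a thread pool, either from regular code or from a coroutine through loop.run_in_executor. The function names stat_sizes and stat_sizes_async and the max_workers default are made up for this illustration.

```python
import asyncio
import os
from concurrent.futures import ThreadPoolExecutor


def stat_sizes(paths, max_workers=4):
    """Return {path: size in bytes}, running os.stat in a pool of threads."""
    paths = list(paths)  # allow any iterable, and let us iterate twice
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `paths`.
        results = pool.map(os.stat, paths)
        return {path: st.st_size for path, st in zip(paths, results)}


async def stat_sizes_async(paths, max_workers=4):
    """Same idea from a coroutine: each os.stat still blocks, but it blocks
    a worker thread, so the event loop can run other coroutines meanwhile."""
    paths = list(paths)
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [loop.run_in_executor(pool, os.stat, p) for p in paths]
        stats = await asyncio.gather(*futures)
    return {path: st.st_size for path, st in zip(paths, stats)}
```

Calling stat_sizes(["/", "/tmp"]) returns a dict of sizes; inside a coroutine, await stat_sizes_async([...]) does the same without stalling the event loop. Any OSError raised by os.stat (for example, a missing file) propagates to the caller in both versions.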
But you may first consider some other parameters:
- According to strace -T, stat is fast: stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000007>, probably faster than starting and synchronizing threads.
- stat is probably I/O bound in most cases, so using more CPUs won't help.
- Doing parallel I/O may turn a nice sequential access pattern into random access, and a physical hard drive may be slower in this context.
But there are also plenty of cases where your stat calls can be faster using a thread pool, for example if you're hitting a distributed file system.
You may also take a look at functools.lru_cache: if you're doing multiple stat calls on the same file or directory, and you're sure it has not changed, caching the result avoids a syscall.
To conclude: keep it simple. os.stat is the efficient way to get a file size.
os.stat() so other coroutines can run during it? – Mouse