How to read from /dev/stdin with asyncio.create_subprocess_exec()

Background

I am calling an executable from Python and need to pass a variable to the executable. The executable however expects a file and does not read from stdin.

I circumvented that problem previously when using the subprocess module by simply calling the executable to read from /dev/stdin along the lines of:

import subprocess

# with executable 'foo'
cmd = ['foo', '/dev/stdin']
input_variable = b'bar'  # bytes, because the pipes are opened in binary mode

with subprocess.Popen(
    cmd,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    ) as process:
    stdout, stderr = process.communicate(input_variable)

    print(f"{process.returncode}, {stdout}, {stderr}")

This worked fine so far. In order to add concurrency, I am now implementing asyncio and as such need to replace the subprocess module with the asyncio subprocess module.

Problem

Calling asyncio subprocess for a program using /dev/stdin fails. Using the following async function:

import asyncio

async def invoke_subprocess(cmd, args, input_variable):
    process = await asyncio.create_subprocess_exec(
        cmd,
        args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        stdin=asyncio.subprocess.PIPE,
    )

    stdout, stderr = await process.communicate(input=bytes(input_variable, 'utf-8'))

    print(f"{process.returncode}, {stdout.decode()}, {stderr.decode()}")

This generally works for files, but fails for /dev/stdin:

# 'cat' can be used for 'foo' to test the behavior
asyncio.run(invoke_subprocess('foo', '/path/to/file/containing/bar', 'not used')) # works
asyncio.run(invoke_subprocess('foo', '/dev/stdin', 'bar')) # fails with "No such device or address"

How can I call asyncio.create_subprocess_exec on /dev/stdin?

Note: I have already tried asyncio.create_subprocess_shell without success, and writing a temporary file is not an option because the file system is read-only.

Minimal example using 'cat'

Script main.py:

import subprocess
import asyncio

def invoke_subprocess(cmd, arg, input_variable):
    with subprocess.Popen(
        [cmd, arg],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        ) as process:
        stdout, stderr = process.communicate(input_variable)

        print(f"{process.returncode}, {stdout}, {stderr}")


async def invoke_async_subprocess(cmd, arg, input_variable):
    process = await asyncio.create_subprocess_exec(
        cmd,
        arg,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        stdin=asyncio.subprocess.PIPE,
    )

    stdout, stderr = await process.communicate(input=input_variable)

    print(f"{process.returncode}, {stdout.decode()}, {stderr.decode()}")


cmd = 'cat'
arg = '/dev/stdin'
input_variable = b'hello world'

# regular subprocess: works
invoke_subprocess(cmd, arg, input_variable)
# asyncio subprocess: fails with "No such device or address"
asyncio.run(invoke_async_subprocess(cmd, arg, input_variable))

Returns:

> python3 main.py
0, b'hello world', b''
1, , cat: /dev/stdin: No such device or address

Tested on:

  • Ubuntu 21.10, Python 3.9.7
  • Linux Mint 20.2, Python 3.8.10
  • Docker image: python:3-alpine
Gannes answered 7/1, 2022 at 16:36 Comment(14)
Are you sure you're running the exact same program in both cases? Whether the program can access /dev/stdin should bear no relation to how it was started. - Glynn
@Glynn 'foo' is the exact same program. I was also surprised and wondered whether asyncio.create_subprocess_exec somehow forbids accessing /dev/stdin or, for some magical reason, runs with different permissions. - Gannes
It doesn't. But you can run your program with strace -f to check how it is being executed and (hopefully) what goes wrong. The "no such device or address" error is unusual, to say the least. - Glynn
Will do. Thanks for pointing me in that direction. However, the same problem occurs when using good old cat as the program, which indicates that this may not be related to the program itself. - Gannes
Sure, strace -f is primarily to check how asyncio is execing the program, especially when compared to what the regular subprocess module is doing. They are supposed to be doing the same thing. - Glynn
BTW can you create a minimal example that reproduces the issue (with cat)? If so, please edit the question to include it. Also, please state whether this is reproducible on multiple machines, or just on a machine with a particular kind of setup. In either case, please specify the type and version of the operating system you're testing it on. - Glynn
Good point. Added the info. - Gannes
Thanks for providing a minimal example, I can indeed reproduce this on my machine! strace confirms that cat is invoked correctly but (in the asyncio case) gets ENXIO when opening /dev/stdin. It seems that an implementation detail of asyncio subprocess vs regular subprocess is interfering with this. To see the difference, change cmd, arg to something like "ls", "-l", "/proc/self/fd/0". For subprocess you'll get something like /proc/self/fd/0 -> pipe:[7344202], whereas for asyncio you'll get /proc/self/fd/0 -> socket:[7339428]. - Glynn
So asyncio uses socket.socketpair() to communicate with the subprocess, whereas subprocess uses a pipe. Asyncio claims that "not all platforms support selecting read events on the write end of a pipe", naming AIX in particular. This seemingly innocuous change breaks re-opening of /dev/stdin, which works with a pipe, but doesn't work with a socket. Bummer. - Glynn
I think this should be reported as a bug on bugs.python.org, but I wouldn't hold my breath as to when it will be fixed (if at all). - Glynn
Note that this also means that programs which implement special treatment for the - filename (esp. GNU tools like cat) will be able to use stdin in this case. It will depend on the implementation of the OP's foo program whether that is a useful workaround or not. - Docia
Thanks for the thorough analysis. I will report it to asyncio and see whether that is picked up. - Gannes
FYI bugs.python.org/issue46364 . Feel free to expand on my description. - Gannes
Related Python PR: github.com/python/cpython/pull/30596 - Gannes
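
The pipe-versus-socket difference described in the comments can be checked without strace. The following is a minimal sketch, not part of the original thread: the child script and the names CHILD, stdin_kind_subprocess and stdin_kind_asyncio are made up for illustration. The child only reports the file type of its own file descriptor 0.

```python
import asyncio
import subprocess
import sys

# Hypothetical child program: classify file descriptor 0 by its file type.
CHILD = """
import os, stat
mode = os.fstat(0).st_mode
if stat.S_ISFIFO(mode):
    print("pipe")
elif stat.S_ISSOCK(mode):
    print("socket")
else:
    print("other")
"""


def stdin_kind_subprocess():
    """Start the child with the regular subprocess module."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=b"",  # passing input implies stdin=subprocess.PIPE
        stdout=subprocess.PIPE,
    )
    return result.stdout.decode().strip()


async def stdin_kind_asyncio():
    """Start the same child through asyncio's subprocess API."""
    process = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await process.communicate(input=b"")
    return stdout.decode().strip()


if __name__ == "__main__":
    print("subprocess:", stdin_kind_subprocess())
    print("asyncio:   ", asyncio.run(stdin_kind_asyncio()))
```

On an affected Python version the asyncio line should report a socket, which matches the /proc/self/fd/0 observation above; on a version that includes the fix it should report a pipe.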

I'll briefly wrap up the question and summarize the outcome of the discussion.

In short: The problem is related to a bug in Python's asyncio library that has been fixed by now. It should no longer occur in upcoming versions.

Bug Details: In contrast to the Python subprocess library, asyncio uses a socket.socketpair() and not a pipe to communicate with the subprocess. This was introduced in order to support the AIX platform. However, it breaks re-opening /dev/stdin, which works with a pipe but not with a socket. The fix restricts the use of sockets to the AIX platform.
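
For Python versions that still hand the child a socket, one possible workaround is to create the pipe yourself and pass its read end as stdin. This is only a sketch under assumptions: it was not proposed in the thread, the names run_with_pipe_stdin and _write_all are hypothetical, and it assumes a POSIX system where /dev/stdin exists.

```python
import asyncio
import os


def _write_all(fd, data):
    """Blocking helper: write all of data to fd, then close it."""
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
    except BrokenPipeError:
        pass  # the child exited without reading all of its input
    finally:
        os.close(fd)  # closing the write end signals EOF to the child


async def run_with_pipe_stdin(cmd, arg, data):
    # Create a real pipe instead of requesting stdin=asyncio.subprocess.PIPE,
    # so that the child can re-open /dev/stdin.
    read_fd, write_fd = os.pipe()
    try:
        process = await asyncio.create_subprocess_exec(
            cmd,
            arg,
            stdin=read_fd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
    except BaseException:
        os.close(write_fd)
        raise
    finally:
        os.close(read_fd)  # the child holds its own copy of the read end

    # Write in a worker thread so a large input cannot block the event loop
    # while the child's output is being collected.
    loop = asyncio.get_running_loop()
    writer = loop.run_in_executor(None, _write_all, write_fd, data)
    stdout, stderr = await process.communicate()
    await writer
    return process.returncode, stdout, stderr


if __name__ == "__main__":
    print(asyncio.run(run_with_pipe_stdin("cat", "/dev/stdin", b"hello world")))
```

If the executable treats - as standard input, as mentioned in the comments for GNU tools like cat, passing - instead of /dev/stdin is simpler and needs no manual pipe.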

Gannes answered 14/10, 2022 at 8:8 Comment(0)
