How do I pass a string into subprocess.Popen (using the stdin argument)?

If I do the following:

import subprocess
from cStringIO import StringIO
subprocess.Popen(['grep','f'],stdout=subprocess.PIPE,stdin=StringIO('one\ntwo\nthree\nfour\nfive\nsix\n')).communicate()[0]

I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/build/toolchain/mac32/python-2.4.3/lib/python2.4/subprocess.py", line 533, in __init__
    (p2cread, p2cwrite,
  File "/build/toolchain/mac32/python-2.4.3/lib/python2.4/subprocess.py", line 830, in _get_handles
    p2cread = stdin.fileno()
AttributeError: 'cStringIO.StringI' object has no attribute 'fileno'

Apparently a cStringIO.StringIO object doesn't quack close enough to a file duck to suit subprocess.Popen. How do I work around this?

Refinement answered 2/10, 2008 at 17:25 Comment(3)
Instead of disputing my answer with this being deleted, I'm adding it as a comment... Recommended reading: Doug Hellmann's Python Module of the Week blog post on subprocess.Refinement
The blog post contains multiple errors, e.g., the very first code example, call(['ls', '-1'], shell=True), is incorrect. I recommend reading the common questions from the subprocess tag description instead. In particular, Why subprocess.Popen doesn't work when args is sequence? explains why call(['ls', '-1'], shell=True) is wrong. I remember leaving comments under the blog post but I don't see them now for some reason.Janssen
For the newer subprocess.run see #48752652Gizzard
J
397

Popen.communicate() documentation:

Note that if you want to send data to the process’s stdin, you need to create the Popen object with stdin=PIPE. Similarly, to get anything other than None in the result tuple, you need to give stdout=PIPE and/or stderr=PIPE too.

Replacing os.popen*

    pipe = os.popen(cmd, 'w', bufsize)
    # ==>
    pipe = Popen(cmd, shell=True, bufsize=bufsize, stdin=PIPE).stdin

Warning Use communicate() rather than stdin.write(), stdout.read() or stderr.read() to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.

So your example could be written as follows:

from subprocess import Popen, PIPE, STDOUT

p = Popen(['grep', 'f'], stdout=PIPE, stdin=PIPE, stderr=STDOUT)    
grep_stdout = p.communicate(input=b'one\ntwo\nthree\nfour\nfive\nsix\n')[0]
print(grep_stdout.decode())
# -> four
# -> five
# ->

On Python 3.5+ (3.6+ for the encoding parameter), you can use subprocess.run to pass input as a string to an external command and get back both its exit status and its output as a string in one call:

#!/usr/bin/env python3
from subprocess import run, PIPE

p = run(['grep', 'f'], stdout=PIPE,
        input='one\ntwo\nthree\nfour\nfive\nsix\n', encoding='ascii')
print(p.returncode)
# -> 0
print(p.stdout)
# -> four
# -> five
# -> 
Janssen answered 3/10, 2008 at 4:11 Comment(14)
This is NOT a good solution. In particular, you cannot asynchronously process p.stdout.readline output if you do this since you'd have to wait for the entire stdout to arrive. It is also memory-inefficient.Ministry
@Ministry What's a better solution?Amentia
@Nick T: "better" depends on context. Newton's laws are good for the domain they are applicable but you need special relativity to design GPS. See Non-blocking read on a subprocess.PIPE in python.Janssen
But note the NOTE for communicate: "do not use this method if the data size is large or unlimited"Derwent
Can someone explain what each step of the commands are doing so that they may be applied to other problems?Atone
@Atone subprocess.Popen may be applied to great many problems. You could start with common problems linked in the subprocess' tag description.Janssen
@J.F.Sebastian But when we write p.communicate(input=b'one\n'), I know we are writing to the child process's stdin. Is the parent process writing through its stdout to the child process's stdin? Can you please explain this in your answer?Squawk
@Squawk print in the code prints to python's stdout. It is unrelated to the passing of input to a subprocess as a bytestring. grep's stdin, stdout, stderr have nothing to do with python's stdin, stdout, stderr in the example. All grep's standard streams are redirected here. .communicate() uses grep's stdin, stdout, stderr in a safe manner (it may use threads, async. io under the hood. It hides the complexity: you just pass a string and it is delivered to the child via pipe for you and the corresponding output is read from another pipe that is connected to grep's stdout and returned).Janssen
@J.F.Sebastian I know that it is the grep subprocess's stdin, stdout, and stderr that are used; those are the descriptors I mean in Popen(grep, stdout, stderr).communicate(b'something'). My question is: how is the parent process sending data (b'something') to grep's stdin?Squawk
@Squawk stdin=PIPE in the code creates a pipe. Any data written by python on one end of the pipe can be read by the grep process on the other end (it is connected to grep's stdin, grep just reads from its stdin). python sees its own end of the pipe as a file-like object p.stdin with all the usual methods: .write(), .flush(), .fileno(), .close().Janssen
You need python 3.6 to use the input arg with subprocess.run(). Older versions of python3 work if you do this: p = run(['grep', 'f'], stdout=PIPE, input=some_string.encode('ascii'))Cressida
@TaborKelly: 1- note: you don't need .encode() -- the code uses the encoding parameter 3- "current Python 3 version" referred to Python 3.6. It is Python 3.7 now.Janssen
Sorry, I have a typo. You need Python 3.6 to use encoding, my example works on Python 3.5. You need python 3.6 to use the encoding arg with subprocess.run(). Which comes in handy since not everyone is running with Python 3.6 yet, for example Debian Stable is on Python 3.5.Cressida
Note that the input parameter of subprocess.run expects a bytes-string unless encoding is specifiedSuperordinate
R
50

I figured out this workaround:

>>> p = subprocess.Popen(['grep','f'],stdout=subprocess.PIPE,stdin=subprocess.PIPE)
>>> p.stdin.write(b'one\ntwo\nthree\nfour\nfive\nsix\n') #expects a bytes type object
>>> p.communicate()[0]
b'four\nfive\n'
>>> p.stdin.close()

Is there a better one?

Refinement answered 2/10, 2008 at 17:27 Comment(6)
@Moe: stdin.write() usage is discouraged, p.communicate() should be used. See my answer.Janssen
Per the subprocess documentation: Warning - Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.Norvall
I think this is a good way to do it if you're confident that your stdout/err won't ever fill up (for instance, it's going to a file, or another thread is eating it) and you have an unbounded amount of data to be sent to stdin.Emanation
In particular, doing it this way still ensures that stdin is closed, so that if the subprocess is one that consumes input forever, the communicate will close the pipe and allow the process to end gracefully.Emanation
@Lucretiel, if the process consumes stdin forever, then presumably it can still write stdout forever, so we'd need completely different techniques all-round (can't read() from it, as communicate() does even with no arguments).Hunyadi
@Lucretiel, anyhow, to avoid deadlocks you'd need the p.stdin.write() to be done in a different thread, and this answer doesn't show the necessary techniques. p.stdin.write() may have a place, but its place is not in an answer that's so short and simple as to not demonstrate how to use it safely.Hunyadi
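For readers wondering what the thread-based technique mentioned in the comments above might look like, here is a minimal sketch. The helper name stream_through and the sample data are made up for illustration, and it assumes grep is on the PATH. A helper thread feeds stdin while the caller reads stdout line by line, so neither pipe can fill up and stall the other side:

```python
import subprocess
import threading


def stream_through(cmd, chunks):
    """Feed `chunks` (an iterable of bytes) to cmd's stdin from a helper
    thread while yielding cmd's stdout line by line."""
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        try:
            # Leaving the with-block closes stdin, which signals EOF to the child.
            with p.stdin:
                for chunk in chunks:
                    p.stdin.write(chunk)
        except BrokenPipeError:
            pass  # the child exited before reading all of its input

    writer = threading.Thread(target=feed)
    writer.start()
    try:
        for line in p.stdout:
            yield line
    finally:
        p.stdout.close()
        writer.join()
        p.wait()


# Prints the two lines that contain an "f": four and five.
for line in stream_through(["grep", "f"], [b"one\n", b"four\n", b"five\n"]):
    print(line.decode(), end="")
```

For small inputs communicate() remains the simpler choice; something like this is only worth the extra code when the input or output is too large to hold in memory.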
E
42

There's a beautiful solution if you're using Python 3.4 or later. Instead of the stdin argument, use the input argument, which accepts bytes:

output_bytes = subprocess.check_output(
    ["sed", "s/foo/bar/"],
    input=b"foo",
)

This works for check_output and run, but not call or check_call for some reason.

In Python 3.7+, you can also add text=True to make check_output take a string as input and return a string (instead of bytes):

output_string = subprocess.check_output(
    ["sed", "s/foo/bar/"],
    input="foo",
    text=True,
)
Eskilstuna answered 8/12, 2016 at 10:4 Comment(3)
@vidstige You're right, that's weird. I would consider filing this as a Python bug; I don't see any good reason why check_output should have an input argument but call should not.Eskilstuna
This is the best answer for Python 3.4+ (using it in Python 3.6). It indeed does not work with check_call but it works for run. It also works with input=string as long as you pass an encoding argument too according to the documentation.Easting
@Eskilstuna the reason is obvious: run and check_output use communicate under the hood, while call and check_call do not. communicate is much heavier, since it involves select to work with streams; call and check_call are much simpler and faster.Bluefield
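If what you actually want is check_call-style behaviour together with input, one possible substitute is run() with check=True, which raises CalledProcessError on a non-zero exit status. A small sketch, assuming grep is on the PATH (the sample data and the variable name status are only for illustration):

```python
import subprocess

data = b"one\nfour\n"

# run() accepts input=, and check=True makes it raise on failure,
# which is roughly what check_call() would do.
result = subprocess.run(["grep", "f"], input=data,
                        stdout=subprocess.PIPE, check=True)
print(result.stdout)

# grep exits with status 1 when nothing matches, so this call raises.
status = None
try:
    subprocess.run(["grep", "zzz"], input=data,
                   stdout=subprocess.PIPE, check=True)
except subprocess.CalledProcessError as exc:
    status = exc.returncode
print("grep exit status:", status)
```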
P
32

I'm a bit surprised nobody suggested creating a pipe, which is in my opinion by far the simplest way to pass a string to stdin of a subprocess:

import os
import subprocess

read, write = os.pipe()
os.write(write, b"stdin input here")  # os.write() needs bytes on Python 3
os.close(write)

subprocess.check_call(['your-command'], stdin=read)
os.close(read)
Pyles answered 2/11, 2015 at 16:34 Comment(6)
The os and the subprocess documentation both agree that you should prefer the latter over the former. This is a legacy solution which has a (slightly less concise) standard replacement; the accepted answer quotes the pertinent documentation.Hardfavored
I'm not sure that's correct, tripleee. The quoted documentation says why it is hard to use the pipes created by the process, but in this solution it creates a pipe and passes it in. I believe it avoids the potential deadlock problems of managing the pipes after the process has already started.Pyles
os.popen is deprecated in favour of subprocessZaffer
-1: it leads to a deadlock, and it may lose data. This functionality is already provided by the subprocess module. Use it instead of reimplementing it poorly (try to write a value that is larger than an OS pipe buffer)Janssen
You deserve the best good man, thank you for the simplest and cleverest solutionAcquiescence
@Hardfavored the implementation of pipes in subprocess module is laughably bad, and is impossible to control. You cannot even get the information about the size of the built-in buffer, not to mention, you cannot tell it what are the read and write ends of the pipe, nor can you change the built-in buffer. In not so many words: subprocess pipes are trash. Don't use them.Menstruate
T
15

I am using python3 and found out that you need to encode your string before you can pass it into stdin:

from subprocess import Popen, PIPE

p = Popen(['grep', 'f'], stdout=PIPE, stdin=PIPE, stderr=PIPE)
out, err = p.communicate(input='one\ntwo\nthree\nfour\nfive\nsix\n'.encode())
print(out)
Trichoid answered 27/7, 2014 at 15:29 Comment(3)
You don't specifically need to encode the input, it just wants a bytes-like object (e.g. b'something'). It will return err and out as bytes also. If you want to avoid this, you can pass universal_newlines=True to Popen. Then it will accept input as str and will return err/out as str also.Smell
But beware, universal_newlines=True will also convert your newlines to match your systemFurtive
If you're using Python 3, see my answer for an even more convenient solution.Eskilstuna
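The universal_newlines=True variant described in the comments above could look something like this. It is only a sketch, assuming grep is on the PATH; note that in this mode the text is encoded and decoded with the locale's default encoding:

```python
from subprocess import Popen, PIPE

# With universal_newlines=True the pipes are opened in text mode, so
# communicate() takes a str and returns str instead of bytes.
p = Popen(["grep", "f"], stdin=PIPE, stdout=PIPE, stderr=PIPE,
          universal_newlines=True)
out, err = p.communicate(input="one\ntwo\nthree\nfour\nfive\nsix\n")
print(repr(out))
```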
C
12

Apparently a cStringIO.StringIO object doesn't quack close enough to a file duck to suit subprocess.Popen

I'm afraid not. The pipe is a low-level OS concept, so it absolutely requires a file object that is represented by an OS-level file descriptor. Your workaround is the right one.

Chronopher answered 2/10, 2008 at 18:33 Comment(0)
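To make the file-descriptor point concrete, here is a small sketch using the Python 3 equivalents (io.BytesIO instead of cStringIO). It assumes grep is on the PATH; the variable names are only for illustration. The in-memory buffer refuses to hand out a descriptor, while a real temporary file works as stdin:

```python
import io
import subprocess
import tempfile

data = b"one\ntwo\nthree\nfour\nfive\nsix\n"

# An in-memory buffer has no OS-level file descriptor to give to the child.
try:
    io.BytesIO(data).fileno()
    has_descriptor = True
except io.UnsupportedOperation:
    has_descriptor = False
print(has_descriptor)

# A real temporary file is backed by a descriptor, so Popen accepts it.
with tempfile.TemporaryFile() as f:
    f.write(data)
    f.seek(0)  # rewind so the child reads from the start of the file
    out = subprocess.check_output(["grep", "f"], stdin=f)
print(out)
```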
L
11
from subprocess import Popen, PIPE
from tempfile import SpooledTemporaryFile as tempfile
f = tempfile()
f.write(b'one\ntwo\nthree\nfour\nfive\nsix\n')  # default mode is 'w+b', so write bytes
f.seek(0)
print(Popen(['/bin/grep','f'],stdout=PIPE,stdin=f).stdout.read())
f.close()
Lunatic answered 13/4, 2012 at 3:36 Comment(1)
fyi, tempfile.SpooledTemporaryFile.__doc__ says: Temporary file wrapper, specialized to switch from StringIO to a real file when it exceeds a certain size or when a fileno is needed.Morna
P
7
"""
Ex: Dialog (2-way) with a Popen()
"""

import subprocess

p = subprocess.Popen('Your Command Here',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     stdin=subprocess.PIPE,
                     shell=True,
                     bufsize=0)  # unbuffered, so each write reaches the child immediately
p.stdin.write(b'START\n')  # the pipes are binary on Python 3, so send and compare bytes
out = p.stdout.readline()
while out:
    line = out.rstrip(b"\n")

    if b"WHATEVER1" in line:
        pr = 1
        p.stdin.write(b'DO 1\n')
        out = p.stdout.readline()
        continue

    if b"WHATEVER2" in line:
        pr = 2
        p.stdin.write(b'DO 2\n')
        out = p.stdout.readline()
        continue

    # ..........

    # Nothing matched: read the next line so the loop can make progress.
    out = p.stdout.readline()

p.wait()
Psoriasis answered 14/6, 2013 at 13:20 Comment(1)
Because shell=True is so commonly used for no good reason, and this is a popular question, let me point out that there are a lot of situations where Popen(['cmd', 'with', 'args']) is decidedly better than Popen('cmd with args', shell=True) and having the shell break the command and arguments into tokens, but not otherwise providing anything useful, while adding a significant amount of complexity and thus also attack surface.Hardfavored
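A quick sketch of the difference described in the comment above, assuming a POSIX system with echo available. The variable names and the sample string are only for illustration:

```python
import subprocess

untrusted = "hello; echo injected"

# List form: the whole string reaches echo as a single literal argument.
as_list = subprocess.run(["echo", untrusted], stdout=subprocess.PIPE,
                         universal_newlines=True).stdout

# String form with shell=True: the shell splits at ';' and runs a second command.
via_shell = subprocess.run("echo " + untrusted, shell=True,
                           stdout=subprocess.PIPE,
                           universal_newlines=True).stdout

print(repr(as_list))
print(repr(via_shell))
```

With the list form the semicolon is just text; with shell=True it becomes a command separator, which is the kind of extra attack surface the comment is warning about.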
G
7

On Python 3.7+ do this:

import subprocess

my_data = "whatever you want\nshould match this f"
subprocess.run(["grep", "f"], text=True, input=my_data)

and you'll probably want to add capture_output=True to get the output of running the command as a string.

On older versions of Python, replace text=True with universal_newlines=True:

subprocess.run(["grep", "f"], universal_newlines=True, input=my_data)
Gizzard answered 27/12, 2019 at 4:29 Comment(0)
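For completeness, the capture_output=True variant mentioned in this answer might look like the following sketch (Python 3.7+, grep assumed to be on the PATH; a trailing newline is added to the sample data here):

```python
import subprocess

my_data = "whatever you want\nshould match this f\n"

# capture_output=True collects stdout and stderr; text=True makes them str.
result = subprocess.run(["grep", "f"], input=my_data,
                        text=True, capture_output=True)
print(result.returncode)
print(result.stdout, end="")
```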
S
6

Beware that Popen.communicate(input=s) may give you trouble if s is too big, because apparently the parent process will buffer it before forking the child subprocess, meaning it needs "twice as much" used memory at that point (at least according to the "under the hood" explanation and linked documentation found here). In my particular case, s was a generator that was first fully expanded and only then written to stdin, so the parent process was huge right before the child was spawned, and no memory was left to fork it:

  File "/opt/local/stow/python-2.7.2/lib/python2.7/subprocess.py", line 1130, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

Skinnydip answered 19/5, 2014 at 14:56 Comment(0)
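One way to sidestep the situation described above is to start the child first and then feed it the generator chunk by chunk, so the full input is never held in memory. The sketch below is only an illustration under some assumptions: numbered_lines is a made-up stand-in for a large lazy input, grep is on the PATH, and stdout is sent to a temporary file so that the child can never block on a full output pipe while the parent is still writing:

```python
import subprocess
import tempfile


def numbered_lines(n):
    """Hypothetical generator standing in for a large, lazily produced input."""
    for i in range(n):
        yield b"line %d\n" % i


with tempfile.TemporaryFile() as out_file:
    # The child is spawned before any input is generated.
    p = subprocess.Popen(["grep", "-c", "7"],
                         stdin=subprocess.PIPE, stdout=out_file)
    with p.stdin:  # closing stdin tells grep that the input has ended
        for chunk in numbered_lines(100000):
            p.stdin.write(chunk)
    p.wait()
    out_file.seek(0)
    match_count = int(out_file.read().decode())

print(match_count)
```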
K
3

This is overkill for grep, but through my journeys I've learned about the Linux command expect, and the python library pexpect

  • expect: dialogue with interactive programs
  • pexpect: Python module for spawning child applications; controlling them; and responding to expected patterns in their output.

import pexpect
child = pexpect.spawn('grep f', timeout=10)
child.sendline('text to match')
child.sendeof()            # signal end of input so grep can finish
child.expect(pexpect.EOF)  # child.before is only populated after an expect() call
print(child.before)
Working with interactive shell applications like ftp is trivial with pexpect

import pexpect
child = pexpect.spawn ('ftp ftp.openbsd.org')
child.expect ('Name .*: ')
child.sendline ('anonymous')
child.expect ('Password:')
child.sendline ('[email protected]')
child.expect ('ftp> ')
child.sendline ('ls /pub/OpenBSD/')
child.expect ('ftp> ')
print(child.before)  # Print the result of the ls command.
child.interact()     # Give control of the child to the user.
Knowledge answered 22/3, 2021 at 21:35 Comment(0)
S
2
import time
from subprocess import Popen, PIPE, STDOUT

p = Popen(['grep', 'f'], stdout=PIPE, stdin=PIPE, stderr=STDOUT)
p.stdin.write(b'one\n')  # the pipe is binary on Python 3, so write bytes
time.sleep(0.5)
p.stdin.write(b'two\n')
time.sleep(0.5)
p.stdin.write(b'three\n')
time.sleep(0.5)
testresult = p.communicate()[0]
time.sleep(0.5)
print(testresult)
Sowens answered 9/4, 2009 at 4:39 Comment(1)
NameError: global name 'PIPE' is not definedIy
