Python popen() - communicate( str.encode(encoding="utf-8", errors="ignore") ) crashes
T

3

12

Using Python 3.4.3 on Windows.

My script runs a small Java program in the console and should capture its output:

import subprocess
p1 = subprocess.Popen([ ... ], stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
out, err = p1.communicate(str.encode("utf-8"))

This leads to a normal

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 135: character maps to <undefined>

Now I want to ignore errors:

out, err = p1.communicate(str.encode(encoding="utf-8", errors="ignore"))

This leads to a more interesting error, for which I found no help on Google:

TypeError: descriptor 'encode' of 'str' object needs an argument

So it seems that Python no longer even knows what the arguments of str.encode(...) are. The same error occurs when you leave out the errors part.

They answered 22/10, 2015 at 14:33 Comment(1)
https://mcmap.net/q/900551/-python-popen-communicate-str-encode-encoding-quot-utf-8-quot-errors-quot-ignore-quot-crashes Hey, I ran into this problem too; thanks for your detailed answer. - Infect
L
25

universal_newlines=True enables text mode. Combined with stdout=PIPE, it forces the child process's output to be decoded using locale.getpreferredencoding(False), which is usually not utf-8 on Windows. That is why you see UnicodeDecodeError.
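As a rough illustration of that decoding failure, the sketch below assumes the preferred encoding is cp1252 (common on Western-European Windows setups, but an assumption here) and that the offending byte came from a UTF-8 encoded right double quotation mark, which is one character whose UTF-8 form ends in 0x9d. The helper try_decode is hypothetical and only exists for this example.

```python
import locale

# The encoding that text mode falls back to when none is given explicitly.
print(locale.getpreferredencoding(False))

# U+201D (right double quotation mark) is b'\xe2\x80\x9d' in UTF-8.
raw = "\u201d".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: {}".format(exc)


as_utf8 = try_decode(raw, "utf-8")      # decodes cleanly
as_cp1252 = try_decode(raw, "cp1252")   # byte 0x9d has no mapping in cp1252
print(as_cp1252)
```

The second call fails with the same kind of 'charmap' message as in the question, because cp1252 leaves byte 0x9d undefined.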

To read the subprocess' output using utf-8 encoding, drop universal_newlines=True:

#!/usr/bin/env python3
from subprocess import Popen, PIPE

with Popen(r'C:\path\to\program.exe "arg 1" "arg 2"',
           stdout=PIPE, stderr=PIPE) as p:
    output, errors = p.communicate()
lines = output.decode('utf-8').splitlines()
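If the goal is still to skip undecodable bytes, the errors argument belongs on the .decode() call applied to the bytes you read, not on anything passed to .communicate(). Here is a minimal sketch; the child command is a hypothetical stand-in (a Python one-liner launched via sys.executable) for the Java program, and it deliberately emits one byte that is not valid UTF-8.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child process: writes valid UTF-8 plus one stray byte (0xff).
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'caf\\xc3\\xa9 \\xff ok')"]

# No universal_newlines=True, so communicate() returns raw bytes.
with Popen(child, stdout=PIPE, stderr=PIPE) as p:
    output, errors = p.communicate()

# errors="ignore" silently drops the bad byte;
# errors="replace" substitutes U+FFFD so the loss stays visible.
ignored = output.decode("utf-8", errors="ignore")
replaced = output.decode("utf-8", errors="replace")
```

errors="replace" is often the safer choice while debugging, since it shows where data was lost.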

str.encode("utf-8") is equivalent to "utf-8".encode(). There is no point in passing it to .communicate() unless you set stdin=PIPE and the child process expects the b'utf-8' bytestring as input.

str.encode(encoding="utf-8", errors="ignore") has the form klass.method(**kwargs). The .encode() method expects self (a string object) as its first argument, which is why you see the TypeError.

>>> str.encode("abc", encoding="utf-8", errors="ignore") #XXX don't do it
b'abc'
>>> "abc".encode(encoding="utf-8", errors="ignore")
b'abc'

Do not use klass.method(obj) instead of obj.method() without a good reason.

Liman answered 22/10, 2015 at 21:35 Comment(0)
T
2

You are not supposed to call .encode() on the class itself. What you probably want to do is something like

p1.communicate("FOOBAR".encode("utf-8"))

The error message you're getting means that the encode() function has nothing to encode, since you called it on the class, rather than on an instance (that would then be passed as the self parameter to encode()).
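A small sketch of both cases from the question may make this concrete. The variable names are only for illustration, and the exact wording of the TypeError message differs between Python versions, so it is not checked here.

```python
# With one positional argument, that string is bound to self,
# so this encodes the literal text "utf-8" using the default encoding.
sent = str.encode("utf-8")
print(sent)

# With keyword arguments only, nothing is left to bind to self.
try:
    str.encode(encoding="utf-8", errors="ignore")
    raised = False
except TypeError as exc:
    raised = True
    print(exc)
```

So the first form "works" only in the sense that it does not raise; it sends the bytes of the word utf-8 to the child process.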

Thirtyeight answered 22/10, 2015 at 15:24 Comment(2)
The problem is: if you use communicate( str.encode("utf-8") ), it works fine (as can be seen in several Stack Overflow examples), apart from some Unicode errors. But when you add the errors argument or use encoding="utf-8", it breaks. - They
That's because what str.encode("utf-8") really does is the same as "utf-8".encode(), which returns... b"utf-8" :) So you're actually sending the text "utf-8" to your app - I don't think that's what you wanted to do. - Illfounded
C
0

If using Popen to run another Python script, then see my answer.

The short answer is to set the environment variable PYTHONIOENCODING and pass encoding='utf-8' to Popen.
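A minimal sketch of that combination follows. Note that the encoding parameter of Popen only exists in Python 3.6 and later, so it is not available on the 3.4.3 used in the question. The child command here is a hypothetical Python one-liner, not a real script from the original post.

```python
import os
import sys
from subprocess import Popen, PIPE

# Copy the current environment and ask the child Python to use UTF-8
# for its standard streams.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# Hypothetical child script: prints non-ASCII text.
child = [sys.executable, "-c", "print('caf\\u00e9 \\u201d')"]

# encoding="utf-8" (Python 3.6+) makes the parent decode the pipes as UTF-8
# instead of using locale.getpreferredencoding(False).
with Popen(child, stdout=PIPE, stderr=PIPE, env=env, encoding="utf-8") as p:
    out, err = p.communicate()

print(out.strip())
```

Both sides have to agree: PYTHONIOENCODING controls how the child encodes its output, and encoding controls how the parent decodes it.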

Czarra answered 29/11, 2022 at 1:10 Comment(0)
