If you have python version >= 3.7, then you should not need to do anything. If you have python 3.6 see the original solution.
EDIT 2017-12-08
I've seen that there is a PEP 538 for py3.7, that will change the entire behavior of python3 encoding management during startup, I think that the new approach will fix the original problem: https://www.python.org/dev/peps/pep-0538/
IMHO the changes targeted to python 3.7 for encoding issues, should have been planed years ago, but better late than never, I guess.
EDIT 2015-09-01
There is an opened issue (enhancement), http://bugs.python.org/issue15216, that will allow to change the encoding in a created (not-used) stream easily (sys.std*). But is targeted to python 3.7 So, we'll have to wait for a while.
Original solution that targets python version 3.6
NOTE: this solution should not be needed for anyone running python version >= 3.7 see PEP 538
Well, my initial workaround had many flaws, I got to pass the click
library check about the encoding, but the encoding itself was not fixed, so I get exceptions when the input parameters or output had non-ascii characters.
I had to implement a more complex method, with 3 steps: set locale, correct encoding in std in/out and re-encode the command line parameters, besides I've added a "friendly" exit if the first try to set the locale doesn't work as expected:
def prevent_ascii_env():
"""
To avoid issues reading unicode chars from stdin or writing to stdout, we need to ensure that the
python3 runtime is correctly configured, if not, we try to force to utf-8,
but It isn't possible then we exit with a more friendly message that the original one.
"""
import locale, codecs, os, sys
# locale.getpreferredencoding() == 'ANSI_X3.4-1968'
if codecs.lookup(locale.getpreferredencoding()).name == 'ascii':
os.environ['LANG'] = 'en_US.utf-8'
if codecs.lookup(locale.getpreferredencoding()).name == 'ascii':
print("The current locale is not correctly configured in your system")
print("Please set the LANG env variable to the proper value before to call this script")
sys.exit(-1)
#Once we have the proper locale.getpreferredencoding() We can change current stdin/out streams
_, encoding = locale.getdefaultlocale()
import io
sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding=encoding, errors="replace", line_buffering=True)
sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding=encoding, errors="replace", line_buffering=True)
sys.stdin = io.TextIOWrapper(sys.stdin.detach(), encoding=encoding, errors="replace", line_buffering=True)
# And finally we need to re-encode the input parameters
for i, p in enumerate(sys.argv):
sys.argv[i] = os.fsencode(p).decode()
This patch solves almost all issues, however it has a caveat, the method shutils.get_terminal_size()
raises a ValueError
because the sys.__stdout__
has been detached, click
lib uses that method to print the help, to fix it I had to apply a monkey-patch on click
lib
def wrapper_get_terminal_size():
"""
Replace the original function termui.get_terminal_size (click lib) by a new one
that uses a fallback if ValueError exception has been raised
"""
from click import termui, formatting
old_get_term_size = termui.get_terminal_size
def _wrapped_get_terminal_size():
try:
return old_get_term_size()
except ValueError:
import os
sz = os.get_terminal_size()
return sz.columns, sz.lines
termui.get_terminal_size = _wrapped_get_terminal_size
formatting.get_terminal_size = _wrapped_get_terminal_size
With this changes all my scripts work fine now when the environment has a wrong locale configured but the system supports en_US.utf-8 (It's the Fedora default locale).
If you find any issue on this approach or have a better solution, please add a new answer.
LANG
,LC_ALL
)" --> read PEP 538 and the related PEP 540. The error appears to only be an issue for python 3.0 to 3.6 because PEP 538 fixes the issues for python >= 3.7. – Pickings