Python 2.7: reload(sys) disables error messages and print in Windows

Asked 3/11, 2012 at 14:40 Answered 24/5, 2014 at 16:30

I'm making a script that requires me to change the default encoding to "UTF-8". I found a topic here on Stack Overflow that said I could use:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

It works great in OS X 10.8 (maybe earlier versions too), but in Windows XP and Windows 7 (probably Vista and 8 too) it disables all feedback in the interpreter. The script still runs, but I can't print anything or see if anything goes wrong.

Is there a way to patch the current code or is there an alternate way to change the encoding?

Handle answered 3/11, 2012 at 14:40 Comment(12)

What do you exactly mean with "disables all feedback"? – Extramundane 3/11, 2012 at 14:45

Might be because cmd.exe doesn't use utf-8 by default? – Pulpit 3/11, 2012 at 14:50

Could you elaborate on "I'm making a script that requires me to change the encoding format..."? Why? – Electroanalysis 3/11, 2012 at 14:51

@Extramundane I don't get any error messages, and print statements don't show anything in the interpreter. – Handle 3/11, 2012 at 14:53

@JakobBowyer And how does your comment help me? – Handle 3/11, 2012 at 14:54

@JonClements I'm importing my school's website (yes, I have permission) as a text file, and I parse through it to find information and index it. It is a Danish website and therefore contains ØÆÅ, which doesn't work by default for me. – Handle 3/11, 2012 at 14:56

How do you import a website into a Python script? – Melodist 3/11, 2012 at 15:30

According to the Python developers M.-A. Lemburg and Martin v. Löwis, changing setdefaultencoding is not a supported way to solve any problem. It will make your Python scripts incompatible with the majority of other Python users, and may lead to unexpected behavior or mojibake. – Holiday 3/11, 2012 at 15:37

setdefaultencoding affects the way Python does implicit conversion between str and unicode. This could happen in lots of ways so to help you fix your script the proper way, we'd need to see your code. In general, you just have to keep track of what is str and what is unicode and don't mix them willy-nilly. Usually you'd want to convert user-inputted strs to unicode, work everywhere with unicode, and encode your unicode to utf-8 or whatever is appropriate only upon output. – Holiday 3/11, 2012 at 15:44

@Holiday Hmm. Is there a good alternative? I must confess I didn't read all the responses in your link. – Handle 3/11, 2012 at 15:48

There is no easy alternative. Python3 will force programmers to pay much closer attention to what is bytes (that is, strs in Python2) and str (or, what is called unicode in Python2). Instead of implicitly converting between the two using the ascii encoding, Python3 will often just raise an exception. So it will pay off in the long run to know the absolute minimum needed to deal with unicode as well as some practical advice on how to deal with unicode in Python. – Holiday 3/11, 2012 at 16:3

I agree with the other commenters. You definitely need to convert your data into unicode object and then work with that. – Extramundane 3/11, 2012 at 17:52
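
A minimal sketch of the pattern the comments above describe: decode bytes to unicode once on input, do all the work on unicode, and encode only on output. The variable names and the sample text are made up for illustration; `raw_page` stands in for the bytes of the downloaded page, and UTF-8 is assumed as its encoding.

```python
# -*- coding: utf-8 -*-
# Hypothetical input: bytes as they would arrive from a file or a web page.
raw_page = u"Æbler, Øl og Ål".encode("utf-8")

# Input boundary: bytes -> unicode, done exactly once.
text = raw_page.decode("utf-8")

# All processing happens on unicode objects, never on raw bytes.
words = text.replace(u",", u"").split()
first_letters = [word[0] for word in words]

# Output boundary: unicode -> bytes, only when writing or printing.
encoded = u" ".join(first_letters).encode("utf-8")
```

Because every str/unicode conversion here is explicit, nothing depends on the interpreter's default encoding, so no call to sys.setdefaultencoding is needed.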

What happens to you may be related to IDLE, since IDLE replaces the default sys.stdin, sys.stdout, and sys.stderr with its own objects. After reload(sys), those three file objects are reset to the default ones, so output no longer shows up in IDLE.

You can work around it by putting them back after reload(sys):

import sys
stdin, stdout, stderr = sys.stdin, sys.stdout, sys.stderr
reload(sys)
sys.stdin, sys.stdout, sys.stderr = stdin, stdout, stderr
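
The same save-and-restore idea could be wrapped in a context manager, so the streams are put back even if something inside the block raises. This is only a sketch; `preserved_streams` is a hypothetical helper name, not something from the standard library.

```python
import contextlib
import sys


@contextlib.contextmanager
def preserved_streams():
    """Put back whatever sys.stdin/stdout/stderr were on entry."""
    saved = sys.stdin, sys.stdout, sys.stderr
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        sys.stdin, sys.stdout, sys.stderr = saved


# Python 2.7 usage, where reload() is a builtin:
#     with preserved_streams():
#         reload(sys)
#         sys.setdefaultencoding('utf-8')
```

The helper itself does not depend on reload, so it restores the streams after any code that swaps them out.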

Kroon answered 24/5, 2014 at 16:30 Comment(1)

Thanks! This hack also solved another problem for me, where standard stream capture (capsys) in pytest broke when the tested code did a reload(sys). Making sure the streams are kept across the reload fixed it. – Spinney 29/8, 2018 at 8:59

To be frank, I have zero idea why you would possibly want to alter the default encoding for Python just to read and parse a single file (or even a great number of files, for that matter). Python can quite easily parse and handle UTF-8 without such drastic measures. Moreover, on this very site, there are some great methods to do so. This issue is close to a duplicate of: Unicode (UTF-8) reading and writing to files in Python

Along those lines, the best answer is: https://mcmap.net/q/86083/-unicode-utf-8-reading-and-writing-to-files-in-python, which basically relies on the Python codecs module.

Using this approach, you can do the following:

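import codecs

# Decode from UTF-8 while reading; 'text' is a unicode object.
with codecs.open("SomeFile", "rb", "utf-8") as inFile:
    text = inFile.read()
# Do something with 'text' here
# Encode back to UTF-8 while writing.
with codecs.open("DifferentFile", "wb", "utf-8") as outFile:
    outFile.write(text)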

This successfully reads a UTF-8 formatted file, then writes it back out as UTF-8. The variable 'text' will be a unicode string in Python. You can always write it back out as UTF-8 or UTF-16 or any compatible output format.
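
As a sketch of writing the text back out in a different encoding, the helper below reads a file as UTF-8 and writes it as UTF-16. It uses io.open rather than codecs.open (a swapped-in alternative that exists in Python 2.6+ and is the same as the built-in open in Python 3); `reencode` and its parameter names are made up for illustration.

```python
import io


def reencode(src_path, dst_path, src_encoding="utf-8", dst_encoding="utf-16"):
    """Read src_path as text in src_encoding, write it to dst_path in dst_encoding."""
    # newline="" turns off newline translation, so line endings pass through unchanged.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as in_file:
        text = in_file.read()
    with io.open(dst_path, "w", encoding=dst_encoding, newline="") as out_file:
        out_file.write(text)
    return text
```

Calling reencode("SomeFile", "DifferentFile") would convert the file from the example above; the "utf-16" codec writes a byte order mark at the start of the output.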

Sutlej answered 14/2, 2013 at 5:56 Comment(0)
