I am trying to read a CSV file with Python with the following code:
import csv

with open("example.txt") as f:
    c = csv.reader(f)
    for row in c:
        print row
My example.txt has only the following content:
Hello world!
For UTF-8 or ANSI encoded files, this gives me the expected output:
> ["Hello world!"]
But if I save the file as UTF-8 with BOM I get this output:
> ["\xef\xbb\xbfHello world!"]
Since I do not have any control over what files the user will use as input, I would like this to work with BOM as well. How can I fix this problem? Is there anything I need to do to ensure that this works for other encodings as well?
utf-8-sig for decoding. – Krystinakrystle

import csv, csvkit, codecs, unicodecsv

with open("example.txt", 'r') as f:
    c = csv.reader(f)
    for row in c:
        print [unicode(s, "utf-8") for s in row]

with open("example.txt", 'r') as f:
    c = unicodecsv.reader(f)
    for row in c:
        print row

with open("example.txt", 'r') as f:
    c = csvkit.reader(f)
    for row in c:
        print row

All of these print [u'\ufeffHello world!'], so I think it is not a duplicate; the first attempt uses #17245915. – Alejoa

utf-8-sig; but some of the other answers don't, which is why I added a comment here. – Krystinakrystle
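
For reference, the utf-8-sig suggestion from the comments could look roughly like the sketch below. It assumes Python 3, where open() accepts an encoding argument (the code in the question is Python 2), and read_rows is a hypothetical helper name, not something from the thread.

```python
import csv


def read_rows(path):
    """Return all CSV rows from path, ignoring a leading UTF-8 BOM if present."""
    # "utf-8-sig" decodes UTF-8 and skips the BOM when the file starts with one;
    # a file without a BOM is decoded the same way as with plain "utf-8".
    # newline="" is the setting the csv module documentation recommends.
    with open(path, encoding="utf-8-sig", newline="") as f:
        return list(csv.reader(f))
```

Calling read_rows("example.txt") should then give the same rows whether or not the file was saved with a BOM. This only covers UTF-8; files in other encodings (UTF-16, or "ANSI" code pages with non-ASCII characters) would still need their encoding to be known or detected separately.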