UnicodeEncodeError: 'ascii' codec can't encode characters

Asked 21/11, 2012 at 12:38 Answered 21/11, 2012 at 12:46

I have a dict that's fed with URL responses, like:

>>> d
{
0: {'data': u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'},
1: {'data': u'<p>some other data</p>'},
...
}

When I call an xml.etree.ElementTree function on these data values (d[0]['data']), I get the most famous error message:

UnicodeEncodeError: 'ascii' codec can't encode characters...

What should I do to this Unicode string to make it suitable for the ElementTree parser?

PS. Please don't send me links to Unicode & Python explanations. I have read them all already and, unfortunately, can't make use of them, as others hopefully can.

Gastroenterostomy answered 21/11, 2012 at 12:38 Comment(0)

You'll have to encode it manually, to UTF-8:

ElementTree.fromstring(d[0]['data'].encode('utf-8'))

because the API only accepts encoded bytes as input. UTF-8 is a good default for such data.

The parser decodes those bytes back to unicode from there:

>>> from xml.etree import ElementTree
>>> p = ElementTree.fromstring(u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'.encode('utf8'))
>>> p.text
u'found "\u62c9\u67cf \u591a\u516c \u56ed"'
>>> print p.text
found "拉柏 多公 园"

Deprived answered 21/11, 2012 at 12:46 Comment(4)

Yes, that was the first thing I tried, as I always do. The problem is with ElementTree.tostring. Can you please try ElementTree.tostring(p, method='text') and tell me why it doesn't work? Thanks – Gastroenterostomy 21/11, 2012 at 12:54

Ah, sorry. It was too obvious. .tostring() has an optional argument 'encoding', which is probably set to ascii by default, so adding encoding='utf-8' works. Cheers – Gastroenterostomy 21/11, 2012 at 12:56

@theta: Hehe, just about to tell you about that. :-) – Deprived 21/11, 2012 at 12:57

For more information, check this issue: bugs.python.org/issue11033 – Viticulture 23/5, 2018 at 5:15
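
To put the fix from the comments next to the answer, here is a minimal sketch of the full round trip: parse from UTF-8 bytes, then serialize with an explicit encoding. The variable names (data, text_bytes, text) are only illustrative, and the snippet is written so that it should run on both Python 2.7 and Python 3.

```python
from xml.etree import ElementTree

# Sample value taken from the question's dict (d[0]['data']).
data = u'<p>found "\u62c9\u67cf \u591a\u516c \u56ed"</p>'

# Parse from UTF-8 encoded bytes, as suggested in the answer.
p = ElementTree.fromstring(data.encode('utf-8'))

# Pass encoding explicitly so that serialization does not fall back
# to the ASCII default mentioned in the comments.
text_bytes = ElementTree.tostring(p, encoding='utf-8', method='text')

# With a byte encoding, tostring() returns encoded bytes;
# decode them to get the unicode text back.
text = text_bytes.decode('utf-8')
```

The same encoding='utf-8' argument should apply to the default method='xml' as well, if you want the markup rather than just the text content.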
