How do I convert ElementTree.Element to a String?
For Python 3:
xml_str = ElementTree.tostring(xml, encoding='unicode')
For Python 2:
xml_str = ElementTree.tostring(xml, encoding='utf-8')
Example usage (Python 3)
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
xml_str = ElementTree.tostring(xml, encoding='unicode')
print(xml_str)
Output:
<Person Name="John" />
Explanation
ElementTree.tostring() returns a bytestring by default in both Python 2 and Python 3. This is an issue because Python 3 switched to using Unicode for strings.
In Python 2 you could use the str type for both text and binary data.
Unfortunately this confluence of two different concepts could lead to
brittle code which sometimes worked for either kind of data, sometimes
not. [...]
To make the distinction between text and binary data clearer and more pronounced, [Python 3] made text and binary data distinct types that cannot blindly be mixed together.
Source: Porting Python 2 Code to Python 3
If you know which version of Python is being used, you should specify the encoding as unicode (Python 3) or utf-8 (Python 2). For reference, I've included a comparison of .tostring() results between Python 2 and Python 3.
ElementTree.tostring(xml)
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml, encoding='unicode')
# Python 3: <Person Name="John" />
# Python 2: LookupError: unknown encoding: unicode
ElementTree.tostring(xml, encoding='utf-8')
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml).decode()
# Python 3: <Person Name="John" />
# Python 2: <Person Name="John" />
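If the same code has to run under both versions, one option is to pick the encoding at runtime. The following is only a minimal sketch: element_to_text is a hypothetical helper name, not part of ElementTree, and it assumes the behaviour shown in the comparison above.

```python
import sys
from xml.etree import ElementTree


def element_to_text(element):
    """Hypothetical helper: serialize an Element using the version-appropriate encoding."""
    if sys.version_info[0] >= 3:
        # Python 3: encoding='unicode' returns a str (text) instead of bytes.
        return ElementTree.tostring(element, encoding="unicode")
    # Python 2: 'unicode' is not a known encoding, so fall back to utf-8,
    # which returns a str (a bytestring in Python 2).
    return ElementTree.tostring(element, encoding="utf-8")


xml = ElementTree.Element("Person", Name="John")
xml_str = element_to_text(xml)
print(xml_str)
```

Under Python 3 this takes the same path as the example usage near the top of this answer.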
Note: While xml_str = ElementTree.tostring(xml).decode() is compatible with both Python 2 & 3, Christopher Rucinski pointed out that this method fails when dealing with non-Latin characters.
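To illustrate what goes wrong, here is a small Python 3 sketch using a made-up attribute value with a non-ASCII character. As far as I can tell, the call does not raise under Python 3; instead, because the default encoding is us-ascii, the character is written as a numeric character reference rather than as itself.

```python
from xml.etree import ElementTree

# "J\u00f6hn" is "John" with an o-umlaut; the value is only an illustration.
xml = ElementTree.Element("Person", Name="J\u00f6hn")

# Default (us-ascii) serialization, then decode: the o-umlaut becomes &#246;
ascii_text = ElementTree.tostring(xml).decode()
print(ascii_text)    # <Person Name="J&#246;hn" />

# encoding='unicode' keeps the character itself in the resulting str
unicode_text = ElementTree.tostring(xml, encoding="unicode")
print(unicode_text)
```

The first result is still valid XML, but it is not the text you would expect if you wanted the original characters preserved in the string.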
Thanks to Martijn Peters for pointing out that the str datatype changed between Python 2 and 3.
Why not use str()?
In most scenarios, using str() would be the "canonical" way to convert an object to a string. However, using str() with Element returns the object's default representation (its tag name and its location in memory as a hexstring), rather than a string representation of the object's data.
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
print(str(xml)) # <Element 'Person' at 0x00497A80>
ElementTree.tostring() also generates a bytestring. The str type is a bytestring in Python 2 (Python 3's str type is called unicode in Python 2). – Purse