So basically, I want to generate an XML with elements generated from data in a python dictionary, where what will come to be tags are the dictionary's keys, and the text the dictionary's values. I have no need to give attributes to the items, and my desired output would look something like this:
<AllItems>
<Item>
<some_tag> Hello World </some_tag>
...
<another_tag />
</Item>
<Item> ... </Item>
...
</AllItems>
I have tried using the xml.etree.ElementTree package, by creating a tree, setting an Element "AllItems" as the root like so:
from xml.etree import ElementTree as et
def dict_to_elem(dictionary):
item = et.Element('Item')
for key in dictionary:
field = et.Element(key.replace(' ',''))
field.text = dictionary[key]
item.append(field)
return item
newtree = et.ElementTree()
root = et.Element('AllItems')
newtree._setroot(root)
root.append(dict_to_elem( {'some_tag':'Hello World', ...} )
# Lather, rinse, repeat this append step as needed
with open( filename , 'w', encoding='utf-8') as file:
tree.write(file, encoding='unicode')
In the last two lines, I have tried omitting the encoding in the open() statement, omitting and changing to 'UTF-8' the encoding in the write() method, and I either get an error that "') is type str is not serializable
So my problem - All I want to know is how should I be going about creating a UTF-8 XML from scratch with the format above, and is there a more robust solution using another package, that will properly allow me to handle UTF-8 characters? I'm not married to ElementTree for a solution, but I would prefer not to have to create a schema. Thanks in advance for any advice/solutions!
tree.write()
uses ascii without explicitencoding
parameter. It converts all non-ascii characters to xml character references e.g.,'☺' -> '☺'
. – Liesa