E

Background

I am using SQLite to access a database and retrieve the desired information. I'm using ElementTree in Python version 2.6 to create an XML file with that information.

Code

import sqlite3
import xml.etree.ElementTree as ET

# NOTE: Omitted code where I acccess the database,
# pull data, and add elements to the tree

tree = ET.ElementTree(root)

# Pretty printing to Python shell for testing purposes
from xml.dom import minidom
print minidom.parseString(ET.tostring(root)).toprettyxml(indent = "   ")

#######  Here lies my problem  #######
tree.write("New_Database.xml")

Attempts

I've tried using tree.write("New_Database.xml", "utf-8") in place of the last line of code above, but it did not edit the XML's layout at all - it's still a jumbled mess.

I also decided to fiddle around and tried doing:
tree = minidom.parseString(ET.tostring(root)).toprettyxml(indent = " ") instead of printing this to the Python shell, which gives the error AttributeError: 'unicode' object has no attribute 'write'.

Questions

When I write my tree to an XML file on the last line, is there a way to pretty print to the XML file as it does to the Python shell?

Can I use toprettyxml() here or is there a different way to do this?

Ebberta answered 2/3, 2015 at 15:48 Comment(1)

Related: Use xml.etree.elementtree to print nicely formatted xml files – Castiron 4/6, 2018 at 20:0

D

88

Whatever your XML string is, you can write it to the file of your choice by opening a file for writing and writing the string to the file.

from xml.dom import minidom

xmlstr = minidom.parseString(ET.tostring(root)).toprettyxml(indent="   ")
with open("New_Database.xml", "w") as f:
    f.write(xmlstr)

There is one possible complication, especially in Python 2, which is both less strict and less sophisticated about Unicode characters in strings. If your toprettyxml method hands back a Unicode string (u"something"), then you may want to cast it to a suitable file encoding, such as UTF-8. E.g. replace the one write line with:

f.write(xmlstr.encode('utf-8'))

Dissolve answered 2/3, 2015 at 15:57 Comment(8)

This answer would be clearer if you included the import xml.dom.minidom as minidom statement that appears to be required. – Clotilde 16/5, 2017 at 15:47

@KenPronovici Possibly. That import appears in the original question, but I've added it here so there's no confusion. – Dissolve 16/5, 2017 at 16:10

This answer is repeated so often on any kind of questions, but it is anything but a good answer: You fully need to convert the whole XML tree to a string, reparse it, to again get it printed, this time just differently. This is not a good approach. Use lxml instead and serialize directly using the builtin method provided by lxml, this way eliminating any interemediate printing followed by reparsing. – Lucindalucine 1/8, 2017 at 18:30

This is an answer about how serialized XML gets written to file, not an endorsement of the OP's serialization strategy, which is undoubtedly Byzantine. I love lxml, but being based on C, it's not always available. – Dissolve 1/8, 2017 at 22:53

In case one wants to use lxml might look at my answer below. – Smaragd 10/5, 2018 at 15:42

This also uses both minidom and elementtree. It would better if it were just one or the other. – Anorexia 20/11, 2019 at 22:56

Change .toprettyxml(indent=" ") to .toprettyxml(indent=" ", newl="\r") to remove all of the blank lines. – Polychaete 6/2, 2020 at 18:49

If you don't want the XML version tag that minidom adds, you can change it to f.write(xmlstr.split('\n', 1)[1]) – Butterfish 16/9, 2021 at 9:59

E

112

I simply solved it with the indent() function:

xml.etree.ElementTree.indent(tree, space=" ", level=0) Appends whitespace to the subtree to indent the tree visually. This can be used to generate pretty-printed XML output. tree can be an Element or ElementTree. space is the whitespace string that will be inserted for each indentation level, two space characters by default. For indenting partial subtrees inside of an already indented tree, pass the initial indentation level as level.

tree = ET.ElementTree(root)
ET.indent(tree, space="\t", level=0)
tree.write(file_name, encoding="utf-8")

Note, the indent() function was added in Python 3.9.

Exophthalmos answered 2/8, 2021 at 7:43 Comment(5)

It should be mentioned that the indent() function was added in Python 3.9. – Gray 25/8, 2021 at 11:40

You are the person. The very person. This is overwhelmingly the best answer. – Solutrean 21/9, 2021 at 3:11

Note that @Tatarize’s answer has effectively a polyfill for this that works on older Python versions. – Fixing 24/2, 2022 at 15:20

I am also overwhelmed with appreciation for this answer. – Irish 18/2, 2023 at 17:21

use space=" " if you want 4 spaces (there are 4 spaces between the quotes) – Unsuspecting 1/7, 2024 at 8:1

D

88

Whatever your XML string is, you can write it to the file of your choice by opening a file for writing and writing the string to the file.

from xml.dom import minidom

xmlstr = minidom.parseString(ET.tostring(root)).toprettyxml(indent="   ")
with open("New_Database.xml", "w") as f:
    f.write(xmlstr)

There is one possible complication, especially in Python 2, which is both less strict and less sophisticated about Unicode characters in strings. If your toprettyxml method hands back a Unicode string (u"something"), then you may want to cast it to a suitable file encoding, such as UTF-8. E.g. replace the one write line with:

f.write(xmlstr.encode('utf-8'))