How to write unescaped string to a XML element with ElementTree?
Asked Answered
H

1

3

I have a string variable contents with following value:

<ph type="0" x="1"></ph>

I try to write it to a XML element as follows:

elemen_ref.text = contents

After I write XML tree to a file and check it with Notepad++, I see following value written to the XML element:

&lt;ph type="0" x="1"&gt;&lt;/ph&gt;

How do I write unescaped string? Please note that this value is copied from another XML element which remains intact after writing tree to a file, so the issue is with assigning value to a text attribute.

Hamlet answered 9/12, 2018 at 11:40 Comment(0)
M
4

You are attempting to do this:

import xml.etree.ElementTree as ET

root = ET.Element('root')
content_str = '<ph type="0" x="1"></ph>'
root.text = content_str

print(ET.tostring(root))
#  <root>&lt;ph type="0" x="1"&gt;&lt;/ph&gt;</root>

Which is essentially "injecting" XML to an element's text property. This is not the correct way to do this.

Instead, you should convert the content string to an actual XML node that can be appended to the existing XML node.

import xml.etree.ElementTree as ET

root = ET.Element('root')
content_str = '<ph type="0" x="1"></ph>'
content_element = ET.fromstring(content_str)
root.append(content_element)

print(ET.tostring(root))
#  <root><ph type="0" x="1" /></root>

If you insist, you can use unescape:

import xml.etree.ElementTree as ET
from xml.sax.saxutils import unescape

root = ET.Element('root')
content_str = '<ph type="0" x="1"></ph>'
root.text = content_str

print(unescape(ET.tostring(root).decode()))
#  <root><ph type="0" x="1"></ph></root>
Mareah answered 9/12, 2018 at 11:49 Comment(2)
Converting content looked like a valid option, but it does not work with other strings of formatted text, for example: My test<pha type="0" /> and 16<bpt i="1" type="25" />th<ept i="1" /> November 2018Hamlet
I am not sure about second option of unescaping whole XML either, because I'm altering a single node in ~300 MB XML document, thus I'm afraid that unescaping whole thing will make unwanted modifications.Hamlet

© 2022 - 2024 — McMap. All rights reserved.