How do I get properly escaped XML in python etree untouched? - McMap

Asked 7/5, 2014 at 11:33 Answered 7/5, 2014 at 11:37

python xml xml.etree

I'm using python version 2.7.3.

test.txt:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <test>The tag &lt;StackOverflow&gt; is good to bring up at parties.</test>
</root>

Result:

>>> import xml.etree.ElementTree as ET
>>> e = ET.parse('test.txt')
>>> root = e.getroot()
>>> print root.find('test').text
The tag <StackOverflow> is good to bring up at parties.

As you can see, the parser must have changed the &lt;'s to <'s, etc.

What I'd like to see:

The tag &lt;StackOverflow&gt; is good to bring up at parties.

Untouched, raw text. Sometimes I really like it raw. Uncooked.

I'd like to use this text as-is for display within HTML, therefore I don't want an XML parser to mess with it.

Do I have to re-escape each string, or is there another way?

Omni answered 7/5, 2014 at 11:33 Comment(2)

For displaying in other sources, simply re-escape! It's a parser's job to give you the proper XML contents after parsing, and HTML escaping can be subtly different anyway. – Sponson 7/5, 2014 at 11:34

Fair point, will probably do that. Just was curious if there's some option in the parser or such. – Omni 7/5, 2014 at 11:58

import xml.etree.ElementTree as ET
e = ET.parse('test.txt')
root = e.getroot()
print(ET.tostring(root.find('test')))

yields

<test>The tag &lt;StackOverflow&gt; is good to bring up at parties.</test>

Alternatively, you could escape the text with saxutils.escape:

import xml.sax.saxutils as saxutils
print(saxutils.escape(root.find('test').text))

yields

The tag &lt;StackOverflow&gt; is good to bring up at parties.
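
Since the goal is display within HTML, it may help to see how XML-style and HTML-style escaping compare. The following is a minimal sketch assuming Python 3 (the html module does not exist on Python 2.7); the `quoted` sample string is made up for illustration. saxutils.escape handles only &, < and > unless extra entities are passed, while html.escape also escapes quotes by default.

```python
import html
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

root = ET.fromstring(
    "<root><test>The tag &lt;StackOverflow&gt; is good to bring up at parties.</test></root>"
)
text = root.find("test").text  # the parser has already unescaped the entities

xml_escaped = escape(text)        # escapes &, < and > only
html_escaped = html.escape(text)  # also escapes " and ' by default

print(xml_escaped)
print(html_escaped)

# The two differ once quotes are involved (hypothetical sample string).
quoted = 'She said "hi" & left'
print(escape(quoted))
print(html.escape(quoted))
```

For the text in the question both functions give the same result; the difference only matters when the value contains quotes, for example when it ends up inside an HTML attribute.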

Baily answered 7/5, 2014 at 11:37 Comment(1)

Both cases simply re-escape the value. – Sponson 7/5, 2014 at 11:51
