ElementTree TypeError "write() argument must be str, not bytes" in Python3
B

5

21

I have a problem generating an SVG file with Python 3 and ElementTree.

    from xml.etree import ElementTree as et
    doc = et.Element('svg', width='480', height='360', version='1.1', xmlns='http://www.w3.org/2000/svg')

    #Doing things with et and doc

    f = open('sample.svg', 'w')
    f.write('<?xml version=\"1.0\" standalone=\"no\"?>\n')
    f.write('<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n')
    f.write('\"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n')
    f.write(et.tostring(doc))
    f.close()

The line f.write(et.tostring(doc)) raises the TypeError "write() argument must be str, not bytes". I don't understand that behavior: shouldn't et.tostring() convert the ElementTree element into a string? It works in Python 2, but not in Python 3. What did I do wrong?

Bourassa answered 27/2, 2017 at 7:57 Comment(3)
Did you check the documentation? See this page and search for tostring. Does that help? – Carden
Not really. It should already be encoded as a UTF-8 bytestring, but Python 3 seems to have a problem with that. – Bourassa
The primary cause is Python's lazy "dynamic typing" mess. This code (which doesn't specify an output encoding in open(), as it should) writes to the file in text mode. But ElementTree.write() wants binary mode, and indeed et.tostring() returns bytes (typing!). The encoding for what will be "XML text" must be given to ElementTree.write() instead. This is because of the XML declaration (which this code writes by hand, but should not): it is written by ET.ElementTree when xml_declaration=True, and it contains the encoding's name. I think that is called "non-transparency of lower layers". – Instrumentation
C
29

As it turns out, tostring, despite its name, really does return an object whose type is bytes.

Stranger things have happened. Anyway, here's the proof:

>>> from xml.etree.ElementTree import ElementTree, tostring
>>> import xml.etree.ElementTree as ET
>>> element = ET.fromstring("<a></a>")
>>> type(tostring(element))
<class 'bytes'>

Silly, isn't it?

Fortunately you can do this:

>>> type(tostring(element, encoding="unicode"))
<class 'str'>
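
Applied to the code in the question, this could look roughly like the sketch below. It keeps the file in text mode and asks tostring for a str. The explicit encoding='utf-8' passed to open() and the with block are additions for illustration, not part of the original code.

```python
from xml.etree import ElementTree as et

doc = et.Element('svg', width='480', height='360', version='1.1',
                 xmlns='http://www.w3.org/2000/svg')

# encoding="unicode" makes tostring return str instead of bytes
svg_text = et.tostring(doc, encoding='unicode')

# Text-mode file: every write() call receives a str
with open('sample.svg', 'w', encoding='utf-8') as f:
    f.write('<?xml version="1.0" standalone="no"?>\n')
    f.write('<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"\n')
    f.write('"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">\n')
    f.write(svg_text)
```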

Yes, we all thought the ridiculousness of bytes and that ancient, forty-plus-year-old-and-obsolete encoding called ascii was dead.

And don't get me started on the fact that they call "unicode" an encoding!!!!!!!!!!!

Carden answered 27/2, 2017 at 8:11 Comment(3)
It was fun to test out. I couldn't believe it when I saw the result of type(tostring(element)). And then seeing the result change because of a change to a parameter value. Wow. That was really weird. Nice question. – Carden
🤔 I don't see why bytes are "ridiculous"? ASCII is in no way obsolete, though restricting oneself to 7 bits is kinda old-school, it's true. Also, the unicode keyword is not indicating an "encoding". The semantics are: if the call attribute "encoding" is unicode, then do not do any encoding, just return a Python string (which does not have any encoding). Properly, there should be two methods to make this API clean: to_string() (yields a character stream without encoding) and to_binary() (yields a byte stream with the desired encoding). – Instrumentation
Was meant to be sarcastic, but I can see that it was not obvious (seven years later :)) – Carden
K
11

The output file should be in binary mode.

f = open('sample.svg', 'wb')
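
Note that once the file is opened in binary mode, the hand-written header lines from the question have to be bytes as well. A minimal sketch, assuming the rest of the question's code stays the same (the b'...' literals and the with block are illustrative changes):

```python
from xml.etree import ElementTree as et

doc = et.Element('svg', width='480', height='360', version='1.1',
                 xmlns='http://www.w3.org/2000/svg')

# Binary-mode file: every write() call must receive bytes
with open('sample.svg', 'wb') as f:
    f.write(b'<?xml version="1.0" standalone="no"?>\n')
    f.write(b'<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"\n')
    f.write(b'"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">\n')
    f.write(et.tostring(doc))  # tostring returns bytes by default
```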
Kessler answered 10/9, 2019 at 1:46 Comment(0)
M
5

Try:

f.write(et.tostring(doc).decode(encoding))

Example:

f.write(et.tostring(doc).decode("utf-8"))
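
A small sketch of what the decode step does, assuming the doc element from the question. Without an encoding argument, tostring serializes as US-ASCII, so decoding the result as UTF-8 should be safe. The comparison with encoding="unicode" is only there for illustration.

```python
from xml.etree import ElementTree as et

doc = et.Element('svg', width='480', height='360', version='1.1',
                 xmlns='http://www.w3.org/2000/svg')

raw = et.tostring(doc)       # bytes
text = raw.decode('utf-8')   # str, suitable for a text-mode file

# For this document, decoding gives the same text as asking for str directly
same = (text == et.tostring(doc, encoding='unicode'))
print(type(raw).__name__, type(text).__name__, same)
```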
Marino answered 27/2, 2017 at 8:4 Comment(0)
A
3

Specify the encoding when converting the bytes to a string before writing the XML file.

Use decode("UTF-8") together with write(). Example: file.write(etree.tostring(doc).decode("UTF-8"))

Afeard answered 27/2, 2017 at 8:17 Comment(0)
H
1

For me, the easiest approach was to first create a template XML file (defining just the root element) and then parse it:

docXml = ET.parse('template.xml')
root = docXml.getroot()

Then I made the changes I wanted to the XML and wrote it out:

docXml.write("output.xml", encoding="utf-8")
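
The same write() call should also work without a template file, by wrapping an element built in code in an ElementTree. A hedged sketch using the doc element from the question. The xml_declaration=True argument is an addition for illustration, and this variant does not emit the DOCTYPE lines from the question.

```python
from xml.etree import ElementTree as et

doc = et.Element('svg', width='480', height='360', version='1.1',
                 xmlns='http://www.w3.org/2000/svg')

# Wrap the root element so that ElementTree.write() can be used
tree = et.ElementTree(doc)

# Given a file name, write() opens the file itself and handles the encoding
tree.write('sample.svg', encoding='utf-8', xml_declaration=True)
```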
Highpriced answered 17/7, 2018 at 9:22 Comment(0)
