Elementtree setting attribute order

I am trying to write a Python script to standardise generic XML files, which are used to configure websites and website forms. To do this, I would like either to maintain the original attribute ordering of the elements or, even better, to be able to rearrange the attributes in a pre-defined way. Most XML parsers I have tried rewrite the attributes in alphabetical order. As these XML files are read, written and maintained by humans, this isn't very useful.

For example, a generic element may look like this in the XML:

<Question QuestionRef="XXXXX" DataType="Integer" Text="Question Text" Availability="Shown" DefaultAnswer="X">

However, once passed through ElementTree and rewritten to a new file, this is changed to:

<Question Availability="Shown" DataType="Integer" DefaultAnswer="X" QuestionRef="XXXXX" Text="Question Text">

The aim of the script is to standardise a large number of XML files in order to improve readability between colleagues. Because the information in an element's attributes has varying levels of significance (e.g. QuestionRef is highly important), the attributes need to be sensibly ordered.

I understand that Python dicts (which attributes are stored in) are naturally unordered and that the XML specification states attribute ordering is insignificant, but human readability is the driving force behind the script.

In other questions (on Stack Overflow) similar to this one, I have seen it remarked that pxdom can do this (question link: link), but I cannot find any mention of how to do this in the pxdom documentation or through a Google search. So is there some way to maintain the order of attributes, or to define it, with current XML parsers? Preferably without resorting to hotpatching :)!

Any help anyone can provide would be greatly appreciated :).

Zeebrugge answered 10/1, 2013 at 12:28 Comment(3)
My first thought is that maybe you should be using a different file format (such as YAML?) for human editing, and then translate that to XML if required by systems... Tildy
If the order of the attributes is important, have you considered using sub-elements (e.g. <Availability>Shown</Availability>) instead? There, the order is definitely preserved. Morven
The problem is that the XML is passed to C# code which generates the websites in HTML and JavaScript. This code has been developed over roughly 5 years, so it is rather bulky. Hence switching from XML is unfeasible, as is moving the availability to a sub-element. Thanks for the help though :) Zeebrugge

Apply the patch described below. In the ElementTree.py file there is a function named _serialize_xml; change its attribute loop as follows:

        ##for k, v in sorted(items):  # remove the sorted here
        for k, v in items:
            if isinstance(k, QName):
                k = k.text
            if isinstance(v, QName):
                v = qnames[v.text]
            else:
                v = _escape_attrib(v, encoding)
            write(" %s=\"%s\"" % (qnames[k], v))

Here, remove sorted(items) and iterate over items directly, as I have done above.

The patch above works when the attributes have no namespace, but attributes with a namespace are still sorted. To disable that sorting as well, replace every {} in ElementTree.py with collections.OrderedDict().

Now all attributes are written in the order in which you added them to the XML element.

Before doing any of the above, read the copyright message by Fredrik Lundh that is present in ElementTree.py.
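
On Python 3.8 and later, ElementTree serializes attributes in insertion order rather than sorting them, so on those versions the pre-defined ordering asked for in the question can probably be done without touching ElementTree.py. The following is only a minimal sketch under that assumption; PREFERRED_ORDER and reorder_attributes are names made up for illustration and are not part of ElementTree:

```python
import xml.etree.ElementTree as ET

# Hypothetical preferred order, taken from the example element in the question.
PREFERRED_ORDER = ["QuestionRef", "DataType", "Text", "Availability", "DefaultAnswer"]


def reorder_attributes(root, preferred=PREFERRED_ORDER):
    """Rewrite every element's attributes so preferred names come first.

    Attributes not listed in `preferred` are kept and placed afterwards
    in alphabetical order. Assumes Python 3.8+, where ElementTree writes
    attributes in dict insertion order.
    """
    rank = {name: i for i, name in enumerate(preferred)}
    for elem in root.iter():
        items = sorted(
            elem.attrib.items(),
            key=lambda kv: (rank.get(kv[0], len(rank)), kv[0]),
        )
        # Re-insert the attributes so the dict's insertion order matches.
        elem.attrib.clear()
        elem.attrib.update(items)


xml_text = (
    '<Question Availability="Shown" DataType="Integer" DefaultAnswer="X" '
    'QuestionRef="XXXXX" Text="Question Text" />'
)
root = ET.fromstring(xml_text)
reorder_attributes(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

On older versions, where the serializer still calls sorted(items), this reordering has no visible effect on the output and the patch above is still needed.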

Irresolution answered 10/1, 2013 at 13:15 Comment(0)
