Get parent element after using find method (xml.etree.ElementTree)

I am working with a huge XML file and trying to extract information from different elements.

import xml.etree.ElementTree as ET
tree = ET.parse('t.xml')
root = tree.getroot()

To find the elements I use the find method:

elm = root.find('.//Element[@elmid="1234"]')

From this I extract information and in addition I need information from the parent element. But elm.find('..') returns only None as documented here: https://docs.python.org/3/library/xml.etree.elementtree.html

Now I use the following:

prt = root.find('.//Element[@elmid="1234"]/..')     
elm = prt.find('./Element[@elmid="1234"]')

This looks a bit unnatural to me, but works.

Do you know a better way to do this? Do you know why only None is returned?

Inconsolable answered 16/6, 2014 at 8:29

The xml.etree API only supports a limited version of XPath. The xml.etree docs for the .. XPath expression state:

Selects the parent element. Returns None if the path attempts to reach the ancestors of the start element (the element find was called on).
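
To make the documented behaviour concrete, here is a minimal sketch using only the standard library. The sample document (the Root and Group tags and the name attribute) is made up for illustration; only Element and elmid come from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only Element/elmid come from the question.
xml_text = """
<Root>
  <Group name="g1">
    <Element elmid="1234"/>
  </Group>
</Root>
"""
root = ET.fromstring(xml_text)

elm = root.find('.//Element[@elmid="1234"]')

# '..' relative to elm would have to climb above the start element,
# so find() has nothing to return and gives None.
parent_from_child = elm.find('..')

# Starting from root, '..' stays inside the subtree being searched,
# so the parent can be resolved.
parent_from_root = root.find('.//Element[@elmid="1234"]/..')

print(parent_from_child)                                   # None
print(parent_from_root.tag, parent_from_root.get('name'))  # Group g1
```

The search only knows about the subtree below the element that find was called on, which is why the same '..' step succeeds from root but not from elm.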

Getting the parent element directly is not supported in the xml.etree API. I would therefore recommend using lxml, where you can simply call getparent() on an element to get its parent:

elm = root.find('.//Element[@elmid="1234"]')
elm.getparent()

lxml also has a full XPath 1.0 implementation, so elm.xpath('..') would work as well (note that xpath() returns a list of matching elements rather than a single element).

Martinic answered 16/6, 2014 at 8:43

I had a similar problem and I got a bit creative. Turns out nothing prevents us from adding the parentage info ourselves. We can later strip it once we no longer need it.

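def addParentInfo(et):
    # Stash a reference to the parent in each child's attribute dict.
    # This is in-memory only: the tree cannot be serialized until
    # stripParentInfo() has removed the references again.
    for child in et:
        child.attrib['__my_parent__'] = et
        addParentInfo(child)

def stripParentInfo(et):
    # Remove the temporary parent references.
    for child in et:
        child.attrib.pop('__my_parent__', None)
        stripParentInfo(child)

def getParent(et):
    # Returns None for the root, which has no stored parent.
    return et.attrib.get('__my_parent__')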

tree = ...
addParentInfo(tree.getroot())
el = tree.findall(...)[0]
parent = getParent(el)
while parent is not None:
    ...
    parent = getParent(parent)
...
stripParentInfo(tree.getroot())
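
A different technique from the one above, which avoids touching the attributes at all, is to build a separate child-to-parent dictionary in one pass. This is only a sketch: the sample document and the names parent_map and ancestors are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only Element/elmid come from the question.
xml_text = '<Root><Group name="g1"><Element elmid="1234"/></Group></Root>'
root = ET.fromstring(xml_text)

# Map every child element to its parent in a single pass over the tree.
parent_map = {child: parent for parent in root.iter() for child in parent}

elm = root.find('.//Element[@elmid="1234"]')

# Walk up towards the root, collecting the tag of each ancestor.
ancestors = []
node = parent_map.get(elm)
while node is not None:
    ancestors.append(node.tag)
    node = parent_map.get(node)

print(ancestors)  # ['Group', 'Root']
```

Because the tree itself is left unmodified, nothing has to be stripped afterwards, but the dictionary must be rebuilt if elements are added or moved.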

Concordia answered 1/3, 2019 at 11:43
