ElementTree XPath - Select Element based on attribute

I am having trouble using the attribute XPath selector in ElementTree, which I should be able to do according to the documentation.

Here's some sample code

XML

<root>
 <target name="1">
    <a></a>
    <b></b>
 </target>
 <target name="2">
    <a></a>
    <b></b>
 </target>
</root>

Python

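import xml.etree.ElementTree as et

def parse(document):
    root = et.parse(document)
    # The attribute predicate below is what triggers the exception
    for target in root.findall("//target[@name='a']"):
        print(list(target))  # child elements of each matching target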

I am receiving the following Exception:

expected path separator ([)
Ruphina answered 21/10, 2008 at 15:52 Comment(3)
I was using ElementTree 1.2.6; the attribute XPath features are only available in 1.3 and later. - Ruphina
Looks like findall only supports a subset of XPath. See the mailing list discussion here. - Chopine
Why close this? It was useful for me... It is hardly off topic. - Stets

The syntax you're trying to use is new in ElementTree 1.3.

That version ships with Python 2.7 and later. If you have Python 2.6 or earlier, you still have ElementTree 1.2.6 or older.
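To see which case applies on a given interpreter, the bundled module exposes its version as the `VERSION` constant. The following is a minimal sketch; the helper `version_tuple` and the variable `supports_attribute_predicates` are names made up for this illustration, and the 1.3 threshold is taken from the statement above.

```python
import sys
import xml.etree.ElementTree as ET


def version_tuple(version):
    """Hypothetical helper: turn a dotted string such as '1.3.0' into (1, 3, 0)."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


# Attribute predicates such as [@name='1'] need ElementTree 1.3 or later.
supports_attribute_predicates = version_tuple(ET.VERSION) >= (1, 3)

print("Python", sys.version_info[:2], "ElementTree", ET.VERSION)
print("attribute predicates available:", supports_attribute_predicates)
```

If the flag comes out false, upgrading Python or switching to lxml are the options mentioned in this thread.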

Coraleecoralie answered 21/10, 2008 at 16:16 Comment(0)

There are several problems in this code.

  1. Python's built-in ElementTree (ET for short) has no real XPath support, only a limited subset. For example, it doesn't support find-from-root expressions like //target.

    Note: the documentation mentions "//", but only for descendants of the current element. So an expression such as .//target is valid; //... is not!

    There is an alternative implementation, lxml, which has richer XPath support. It seems that the lxml documentation was used with the built-in code. The two do not match, so that does not work.

  2. The @name notation selects XML attributes, that is, the key=value pairs within an XML tag.

    So the value of name has to be 1 or 2 to select something in the given document. Alternatively, one can search for targets with a child element 'a': target[a] (no @).
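Both points can be checked with a short sketch. It assumes the ElementTree bundled with Python 2.7 or later, and it uses a condensed copy of the question's document parsed with `fromstring`, so `root` here is an Element. The variable names are made up for this illustration, and the exact error message may differ between versions.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the sample document from the question.
root = ET.fromstring(
    '<root>'
    '<target name="1"><a/><b/></target>'
    '<target name="2"><a/><b/></target>'
    '</root>'
)

# Point 1: an absolute path on an element is rejected.
try:
    root.findall("//target")
    absolute_path_error = None
except SyntaxError as exc:
    absolute_path_error = str(exc)

# The relative form, anchored at the current element, is accepted.
relative_matches = root.findall(".//target")

# Point 2: no target has name="a", so the attribute predicate matches nothing,
# while the child-element predicate [a] matches both targets.
by_attribute = root.findall(".//target[@name='a']")
by_child = root.findall(".//target[a]")

print("absolute path error:", absolute_path_error)
print("relative matches:", len(relative_matches))
print("[@name='a'] matches:", len(by_attribute))
print("[a] matches:", len(by_child))
```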

For the given document, parsed with the built-in ElementTree (v1.3) into root, the following code is correct and working:

  • root.findall(".//target") Finds both targets
  • root.findall(".//target/a") Finds the two a-elements
  • root.findall(".//target[a]") Finds both target-elements again, as both have an a-element
  • root.findall(".//target[@name='1']") Finds only the first target. Notice that the quotes around 1 are needed; otherwise a SyntaxError is raised
  • root.findall(".//target[a][@name='1']") Also valid; finds that same target
  • root.findall(".//target[@name='1']/a") Finds only one a-element; ...
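The expressions in the list above can be run against the question's document as follows. This is a sketch assuming ElementTree 1.3 or later; `DOCUMENT`, `PATHS`, `describe`, and `results` are names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
DOCUMENT = """<root>
 <target name="1">
    <a></a>
    <b></b>
 </target>
 <target name="2">
    <a></a>
    <b></b>
 </target>
</root>"""

root = ET.fromstring(DOCUMENT)

PATHS = [
    ".//target",
    ".//target/a",
    ".//target[a]",
    ".//target[@name='1']",
    ".//target[a][@name='1']",
    ".//target[@name='1']/a",
]


def describe(element):
    """Hypothetical helper: show the tag, plus the name attribute if present."""
    name = element.get("name")
    return element.tag if name is None else "%s[name=%s]" % (element.tag, name)


# Map each path to a readable list of the elements it matched.
results = {path: [describe(el) for el in root.findall(path)] for path in PATHS}

for path in PATHS:
    print(path, "->", results[path])
```

Each path is relative (it starts with `.`), which is what lets the built-in implementation accept it.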
Huntingdon answered 19/4, 2013 at 13:0 Comment(0)
