How to select all leaf nodes using an XPath expression?

4

53

I believe it's possible but couldn't figure out the syntax. Something like this:

xmlNode.SelectNodes("//*[count(child::*) <= 1]")

but this is not correct.

Aquaplane answered 13/10, 2010 at 18:4 Comment(1)
Good question, +1. See my answer for the probably shortest XPath expression that selects exactly all leaf nodes. :)Angrist
74

Use:

//node()[not(node())]

If only element leaf nodes are wanted (and this needs clarification: are elements that have only non-element children, such as text nodes, considered leaf nodes?), then the following XPath expression selects every element that has no element children:

//*[not(*)]

Both expressions above are probably the shortest that select the desired nodes: leaf nodes of any kind in the first case, leaf elements in the second.
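
To make the difference between the two definitions concrete, here is a minimal sketch in Python. It does not evaluate the XPath expressions themselves: the standard library has no full XPath 1.0 engine, so the two predicates are emulated by walking an xml.dom.minidom tree. The sample document and the helper name descendants are assumptions made up for illustration.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
SAMPLE = "<root><a>x</a><b/><c><d/><!--note--></c></root>"


def descendants(node):
    """Yield every descendant of node in document order (children only, no attributes)."""
    for child in node.childNodes:
        yield child
        yield from descendants(child)


doc = minidom.parseString(SAMPLE)

# Emulates //node()[not(node())]: any node that has no child nodes at all.
leaf_nodes = [n.nodeName for n in descendants(doc) if not n.childNodes]

# Emulates //*[not(*)]: elements that have no element children.
leaf_elements = [
    n.tagName
    for n in descendants(doc)
    if n.nodeType == n.ELEMENT_NODE
    and not any(c.nodeType == c.ELEMENT_NODE for c in n.childNodes)
]

print(leaf_nodes)     # ['#text', 'b', 'd', '#comment']
print(leaf_elements)  # ['a', 'b', 'd']
```

Note that a is a leaf element but not a leaf node, because it has a text node child; under the first definition the text node itself is the leaf.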

Angrist answered 13/10, 2010 at 18:16 Comment(4)
Can you explain why this works? I've looked over the XPath syntax and some tutorials, but I can't quite understand why this works.Callous
@rrs: The first expression selects any node in the XML document that doesn't have any children -- this is what a leaf node is, by definition. The second does something similar, but it selects any element that doesn't have a child element.Angrist
I understand what it does, but not how it does it. Why/how does not(*) select leaf nodes/elements?Callous
not(*) means "does not have any element child", because "* selects all element children of the context node" as per the W3C XPath 1.0 recommendation: w3.org/TR/xpath/#path-abbrev (second bullet). A predicate keeps a node only when its expression is true, and not() of an empty node-set is true, so only elements without element children survive. This is a very short explanation; to go in depth, one needs a more or less full course in XPath. May I shamelessly recommend the second module of my Pluralsight training course on "XSLT 2.0 and 1.0 foundations"? The title of this course is "A Crash Course in XPath" and it is 70 minutes long: pluralsight.com/training/Courses/TableOfContents/…Angrist
31

Any element with no element children:

//*[not(child::*)]
Synder answered 13/10, 2010 at 18:26 Comment(5)
+1 Right answer. But it means: any element with no element child. So it will select empty elements, elements with a text node child, and elements whose only children are non-element nodes (text nodes, PIs, comments).Cable
You probably meant: "Elements that don't have element-children" -- not "Elements with no children". It would be good to acknowledge this and to correct the text of your otherwise good answer.Angrist
@Dimitre Thanks for holding my hand. I am SO n00b.Synder
What is the difference from @DimitreNovatchev's //*[not(*)]?Overeager
@lajarre, This is equivalent to the second expression in my answer -- only longer.Angrist
2

Why less than or equal to 1? A leaf element has zero child elements, so test for zero:

xmlNode.SelectNodes("//*[count(child::*) = 0]")

You can test expressions at this site: http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm

It is pretty helpful.
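
As a rough illustration of why <= 1 over-selects, the sketch below emulates count(child::*) in Python with xml.etree.ElementTree, where len(element) is the number of child elements. This is not the .NET SelectNodes call from the question, and the sample document is a made-up assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <wrapper> has exactly one child element, so it is not a leaf.
root = ET.fromstring(
    "<root><wrapper><item>x</item></wrapper><empty/><other/></root>"
)

# len(elem) counts child elements only, which is what count(child::*) computes.
# root.iter() visits the root and all descendant elements, like //*.
at_most_one = [e.tag for e in root.iter() if len(e) <= 1]
exactly_zero = [e.tag for e in root.iter() if len(e) == 0]

print(at_most_one)   # ['wrapper', 'item', 'empty', 'other']
print(exactly_zero)  # ['item', 'empty', 'other']
```

With <= 1 the non-leaf wrapper element is selected as well; the = 0 test returns only the elements without element children.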

Quarterage answered 13/10, 2010 at 18:17 Comment(2)
Thanks very much. This works great. So equality is written VB-style, with a single =. I thought it would be C-style because function names are case-sensitive. Why <= 1? I was confused by ChildNodes.Count, which returns 1 for <A>x</A> but 0 for <A/>.Aquaplane
and @miliu: the count test is not needed. Check @Synder's answer.Cable
0

I'm adding this XSLT answer since the top Google results seem to lack such a solution:

After a long struggle with extracting CDATA as XML, eventually, this expression worked best for me:

<xsl:template match="*[not(child::*)]/text()">
Gallaher answered 5/12, 2017 at 15:49 Comment(0)
