Only select text directly in node, not in child nodes

Asked 19/12, 2010 at 16:50 Answered 25/4, 2017 at 15:3

How does one retrieve the text in a node without selecting the text in the children?

<div id="comment">
     <div class="title">Editor's Description</div>
     <div class="changed">Last updated: </div>
     <br class="clear">
     Lorem ipsum dolor sit amet.
</div>

In other words, I want "Lorem ipsum dolor sit amet." rather than "Editor's DescriptionLast updated: Lorem ipsum dolor sit amet."

Navarino answered 19/12, 2010 at 16:50 Comment(0)

In the provided XML document:

<div id="comment">
      <div class="title">Editor's Description</div>
      <div class="changed">Last updated: </div>
      <br class="clear" />
      Lorem ipsum dolor sit amet. 
</div>

The top element, /div, has four child nodes that are text nodes. The first three of these are whitespace-only; the fourth is the one that is wanted.

Use:

/div/text()[last()]

This is different from:

/div/text()

The latter may (depending on whether whitespace-only nodes are preserved by the XML parser) select all 4 text nodes, but you only want the last of them.

An alternative, when you don't know exactly which text node you want, is:

/div/text()[normalize-space()]

This selects all text-node-children of /div that are not whitespace-only text nodes.
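To make the difference between the three expressions concrete, here is a minimal Python sketch. It does not run an XPath engine: it walks the child nodes with the standard library's xml.dom.minidom and mimics what each expression would select. The helper name child_text_nodes is made up for illustration, and the sketch assumes the <br> is written as a self-closing tag so that the sample parses as well-formed XML.

```python
from xml.dom import minidom

# Sample from the question; <br> is self-closed here (an assumption)
# so that the fragment is well-formed XML.
XML = """<div id="comment">
     <div class="title">Editor's Description</div>
     <div class="changed">Last updated: </div>
     <br class="clear" />
     Lorem ipsum dolor sit amet.
</div>"""

XPATH_WHITESPACE = " \t\r\n"  # the characters XPath treats as whitespace


def child_text_nodes(element):
    """Hypothetical helper: data of the text nodes that are direct
    children of `element`, roughly what /div/text() selects."""
    return [n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE]


root = minidom.parseString(XML).documentElement

all_text = child_text_nodes(root)   # mimics /div/text()
last_text = all_text[-1]            # mimics /div/text()[last()]
# mimics /div/text()[normalize-space()]: keep nodes that are not whitespace-only
non_blank = [t for t in all_text if t.strip(XPATH_WHITESPACE)]

print(len(all_text))                      # number of child text nodes
print(repr(last_text.strip(XPATH_WHITESPACE)))
print([t.strip(XPATH_WHITESPACE) for t in non_blank])
```

Text inside the nested div elements never appears in any of the three results, because only direct children of the top element are inspected. A parser configured to drop whitespace-only nodes would yield fewer entries in all_text.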

Ewers answered 19/12, 2010 at 17:3 Comment(13)

@Dimitre, the question is to select the text without child nodes, the first suggestion by you doesn't do this. – Fourgon 19/12, 2010 at 17:8

@Lucero: Why? I haven't suggested the use of the descendant:: axis or the // abbreviation. The first expression selects just one text node: the last child text node of /div. the alternative selects any child text node of /div that is not whitespace-only. – Ewers 19/12, 2010 at 17:14

@Dimitre, simply because nothing says that the wanted text will be the last node? – Fourgon 19/12, 2010 at 17:16

@Lucero: I have edited my answer to make it more clear. Hope you understand it now. – Ewers 19/12, 2010 at 17:28

@Dimitre, the question was to get the text without the text of the child nodes. Getting the last text node only is working for the given sample, but not answering the question in general. – Fourgon 19/12, 2010 at 17:28

@Lucero: I think that the edited answer meets your objections -- it explains the two alternatives one has: either know exactly which node you want to select, or select all text nodes that are not white-space only. Both expressions avoid selecting whitespace-only text nodes -- something that may happen using your suggested solution. Do note that the OP really wants only non-whitespace-only text nodes. – Ewers 19/12, 2010 at 17:33

@Dimitre, in fact the white space stripping was useful as well, thanks to both – Navarino 19/12, 2010 at 17:38

I just don't get why both of the solutions don't work for me in Firefox with XPather, but //div/text()[normalize-space() and parent::div[@id='comment']] is fine. – Lynnet 19/12, 2010 at 17:45

@styu: Then you are evaluating the XPath expressions against a different XML document (not against the provided XML document) – Ewers 19/12, 2010 at 17:52

@Dimitre I think it's an issue with XPather. Your XPath Visualizer and an other one works fine, thanks. – Lynnet 19/12, 2010 at 18:53

This does not solve the answer for me. I need the xpath result to be in the form of a webelement, not a String, and so using /text() is not an option. – Toffeenosed 4/6, 2015 at 22:44

@djangofan, text() selects all text-node children of the current node -- not strings as you believe. As for "webelements", no such thing exists in XPath. – Ewers 4/6, 2015 at 22:50

@SeanDuggan, Yes, XPath is a very elegant and powerful language. – Ewers 4/11, 2015 at 2:48

Just select text() instead of .:

div/text()

On the given XML fragment, this returns:

Lorem ipsum dolor sit amet.

Fourgon answered 19/12, 2010 at 16:56 Comment(0)

How about this:
$doc/node()[3]/text()
This assumes $doc holds the XML.

Chariot answered 25/4, 2017 at 15:3 Comment(0)
