How do I select text nodes with jQuery?

12

408

I would like to get all descendant text nodes of an element, as a jQuery collection. What is the best way to do that?

Wold answered 18/11, 2008 at 13:45 Comment(0)
275

jQuery doesn't have a convenient function for this. You need to combine contents(), which will give just child nodes but includes text nodes, with find(), which gives all descendant elements but no text nodes. Here's what I've come up with:

var getTextNodesIn = function(el) {
    return $(el).find(":not(iframe)").addBack().contents().filter(function() {
        return this.nodeType == 3;
    });
};

getTextNodesIn(el);

Note: If you're using jQuery 1.7 or earlier, the code above will not work. To fix this, replace addBack() with andSelf(). andSelf() is deprecated in favour of addBack() from 1.8 onwards.
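If the same code has to run against both older and newer jQuery versions, one option is to feature-detect the method instead of hard-coding it. The sketch below is only an illustration: the helper name addSelf is hypothetical, and the two stand-in objects mimic just the one method that matters so the check can run without jQuery loaded.

```javascript
// Hypothetical helper: calls addBack() when the collection has it
// (jQuery 1.8+), otherwise falls back to andSelf() (jQuery 1.7 and earlier).
function addSelf($set) {
    return typeof $set.addBack === "function" ? $set.addBack() : $set.andSelf();
}

// Stand-in objects (not real jQuery collections) exposing only the
// method each jQuery era provides.
var modernSet = { addBack: function () { return "addBack called"; } };
var legacySet = { andSelf: function () { return "andSelf called"; } };

console.log(addSelf(modernSet)); // addBack called
console.log(addSelf(legacySet)); // andSelf called
```

With real jQuery, the chain would presumably become addSelf($(el).find(":not(iframe)")).contents().filter(...).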

This is somewhat inefficient compared to pure DOM methods and has to include an ugly workaround for jQuery's overloading of its contents() function (thanks to @rabidsnail in the comments for pointing that out), so here is a non-jQuery solution using a simple recursive function. The includeWhitespaceNodes parameter controls whether or not whitespace text nodes are included in the output (in jQuery they are automatically filtered out).

Update: Fixed bug when includeWhitespaceNodes is falsy.

function getTextNodesIn(node, includeWhitespaceNodes) {
    var textNodes = [], nonWhitespaceMatcher = /\S/;

    function getTextNodes(node) {
        if (node.nodeType == 3) {
            if (includeWhitespaceNodes || nonWhitespaceMatcher.test(node.nodeValue)) {
                textNodes.push(node);
            }
        } else {
            for (var i = 0, len = node.childNodes.length; i < len; ++i) {
                getTextNodes(node.childNodes[i]);
            }
        }
    }

    getTextNodes(node);
    return textNodes;
}

getTextNodesIn(el);
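The traversal only reads nodeType, nodeValue and childNodes, so its behaviour can be checked outside a browser with plain objects standing in for DOM nodes. The sketch below is an iterative variant (an explicit stack instead of recursion), not the function above; the names collectTextNodes, text and el are hypothetical and exist only for this illustration.

```javascript
// Hypothetical stand-ins for DOM nodes: only the properties the
// traversal reads (nodeType, nodeValue, childNodes) are modelled.
function text(value) {
    return { nodeType: 3, nodeValue: value, childNodes: [] };
}
function el() {
    return { nodeType: 1, nodeValue: null, childNodes: Array.prototype.slice.call(arguments) };
}

// Iterative variant: an explicit stack replaces the recursion.
function collectTextNodes(root, includeWhitespaceNodes) {
    var textNodes = [], stack = [root], nonWhitespaceMatcher = /\S/;
    while (stack.length) {
        var node = stack.pop();
        if (node.nodeType === 3) {
            if (includeWhitespaceNodes || nonWhitespaceMatcher.test(node.nodeValue)) {
                textNodes.push(node);
            }
        } else {
            // Push children in reverse so they are popped in document order.
            for (var i = node.childNodes.length - 1; i >= 0; --i) {
                stack.push(node.childNodes[i]);
            }
        }
    }
    return textNodes;
}

var tree = el(text("a"), el(text("  "), text("b")), text("c"));
var values = collectTextNodes(tree, false).map(function (n) { return n.nodeValue; });
var allValues = collectTextNodes(tree, true).map(function (n) { return n.nodeValue; });

console.log(values);    // whitespace-only node skipped
console.log(allValues); // whitespace-only node kept
```

In a browser the same function should accept a real element, since DOM nodes expose the same three properties.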
Select answered 9/12, 2010 at 15:13 Comment(19)
Can the element passed in be the name of a div? - Festinate
@crosenblum: You could call document.getElementById() first, if that's what you mean: var div = document.getElementById("foo"); var textNodes = getTextNodesIn(div); - Select
Because of a bug in jQuery, if you have any iframes in el you'll need to use .find(':not(iframe)') instead of .find('*'). - Scrubby
@rabidsnail: I think the use of .contents() anyway implies it will search through the iframe as well. I don't see how it could be a bug. - Fry
bugs.jquery.com/ticket/11275 Whether this is actually a bug seems to be up for debate, but bug or not, if you call find('*').contents() on a node that contains an iframe which hasn't been added to the DOM you'll get an exception at an undefined point. - Scrubby
@rabidsnail: OK, I think that's at least an annoyance (if not a bug) in jQuery and a point in favour of the plain DOM version. I'll edit my answer. Thanks. - Select
andSelf() was deprecated in jQuery 1.8; you can use addBack() instead. - Greening
You could consider nonWhitespace = /\S/ and if (includeWhitespaceNodes || nonWhitespace.test(node.nodeValue)) { which at least boasts greater simplicity (though it would respond differently to empty text nodes, if those are possible). I also think there could be improvement in the regex variable name... something like whitespaceMatcher or something to indicate what the variable is. - Zaporozhye
@ErikE: I like descriptive variable names. I have a feeling I picked whitespace to avoid the code having horizontal scrollbars on my browser. - Select
@ErikE: I agree with you on both counts and have edited my answer. Empty text nodes are indeed possible but will be treated the same by both !/^\s*$/.test() and /\S/.test() so there's no problem there. - Select
Oh, right, it was * not + so empty nodes were matched before. Glad you liked my suggestions! - Zaporozhye
Great suggestion. I would recommend using Node.TEXT_NODE in place of 3 for better readability. - Watchful
@BenS: I would but Node.TEXT_NODE isn't supported in IE <= 8. - Select
This code has a bug in it. Right now when you pass false for including whitespace, it ONLY returns whitespace nodes instead of excluding them. The line if (includeWhitespaceNodes || !nonWhitespaceMatcher.test(node.nodeValue)) should instead read: if (includeWhitespaceNodes || nonWhitespaceMatcher.test(node.nodeValue)). - Gelman
@BrianGeihsler: You're right, thanks. I simplified the regular expression last November but failed to negate the condition. Wish I'd tested it now. - Select
@TimDown I tried your method but it gives the nodes out of order. What must be done to have tags in order? I asked a separate question here #63270623 - Serdab
@Amanda: I think the non-jQuery version will give you nodes in document order. - Select
@TimDown You mean the jQuery version? I am using cheerio and I am not getting that. - Serdab
@TimDown I came across an answer by AKX at #63270623. This seems to work. How is it different from what you proposed in the answer? Could you please explain/help? - Serdab
222

Jauco posted a good solution in a comment, so I'm copying it here:

$(elem)
  .contents()
  .filter(function() {
    return this.nodeType === 3; //Node.TEXT_NODE
  });
Wold answered 18/11, 2008 at 13:47 Comment(8)
Actually, $(elem).contents().filter(function() { return this.nodeType == Node.TEXT_NODE; }); is enough. - Adne
IE7 doesn't define the Node global, so you have to use this.nodeType == 3, unfortunately: https://mcmap.net/q/25935/-node-text_node-and-ie7 - Wold
Does this not only return the text nodes that are the direct children of the element rather than descendants of the element as the OP requested? - Select
I've just noticed that your first answer from 2008 was almost exactly what I independently came up with much later. Why did you edit it? - Select
Add .text() at the end if you want it to be a string. Otherwise it's still an object. Trying to show it in the document will end up displaying [object Object]. - Teapot
@ChristianOudard That would be really easy to polyfill, no? Would make your code a bit more legible. - Mestas
This will not work when the text node is deeply nested inside other elements, because the contents() method only returns the immediate child nodes: api.jquery.com/contents - Methylamine
@Jauco, nope, not enough, as .contents() returns only the immediate child nodes. - Methylamine
20
$('body').find('*').contents().filter(function () { return this.nodeType === 3; });
Saintjust answered 19/10, 2011 at 16:7 Comment(0)
8

jQuery.contents() can be used with jQuery.filter to find all child text nodes. With a little twist, you can find grandchildren text nodes as well. No recursion required:

$(function() {
  var $textNodes = $("#test, #test *").contents().filter(function() {
    return this.nodeType === Node.TEXT_NODE;
  });
  /*
   * for testing
   */
  $textNodes.each(function() {
    console.log(this);
  });
});
div { margin-left: 1em; }
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>

<div id="test">
  child text 1<br>
  child text 2
  <div>
    grandchild text 1
    <div>grand-grandchild text 1</div>
    grandchild text 2
  </div>
  child text 3<br>
  child text 4
</div>

jsFiddle

Belvia answered 19/1, 2014 at 10:29 Comment(1)
I tried this. It prints tag names out of order. Is there a way to print tag names in the order they occur? I asked a separate question here #63276878 - Serdab
4

I was getting a lot of empty text nodes with the accepted filter function. If you're only interested in selecting text nodes that contain non-whitespace, try adding a nodeValue conditional to your filter function, like a simple $.trim(this.nodeValue) !== '':

$('element')
    .contents()
    .filter(function(){
        return this.nodeType === 3 && $.trim(this.nodeValue) !== '';
    });

http://jsfiddle.net/ptp6m97v/

Or to avoid strange situations where the content looks like whitespace, but is not (e.g. the soft hyphen &shy; character, newlines \n, tabs, etc.), you can try using a Regular Expression. For example, \S will match any non-whitespace characters:

$('element')
    .contents()
    .filter(function(){
        return this.nodeType === 3 && /\S/.test(this.nodeValue);
    });
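To see how the two checks treat specific characters, a quick comparison can be run in plain JavaScript. This is a sketch under one assumption: native String.prototype.trim stands in for $.trim, which may differ slightly in very old jQuery versions. For the sample values chosen here the two checks agree, and the soft hyphen counts as non-whitespace for both.

```javascript
// Sample node values; "\u00AD" is the soft hyphen, "\u00A0" a non-breaking space.
var samples = ["", "   ", "\n\t", "\u00A0", "\u00AD", " text "];

var results = samples.map(function (value) {
    return {
        value: JSON.stringify(value),
        trimNonEmpty: value.trim() !== "",       // native trim, standing in for $.trim
        regexNonWhitespace: /\S/.test(value)     // the regular-expression check
    };
});

results.forEach(function (r) {
    console.log(r.value, r.trimNonEmpty, r.regexNonWhitespace);
});
```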
Grassi answered 13/10, 2014 at 18:3 Comment(1)
I tried this. It prints tag names out of order. Is there a way to print tag names in the order they occur? I asked a separate question here #63276878 - Serdab
3

If you can make the assumption that all children are either Element Nodes or Text Nodes, then this is one solution.

To get all child text nodes as a jQuery collection:

$('selector').clone().children().remove().end().contents();

To get a copy of the original element with non-text children removed:

$('selector').clone().children().remove().end();
Metalanguage answered 20/4, 2011 at 14:52 Comment(1)
Just noticed Tim Down's comment on another answer. This solution only gets the direct children, not all descendants. - Metalanguage
1

Can also be done like this:

var textContents = $(document.getElementById("ElementId").childNodes).filter(function(){
        return this.nodeType == 3;
});

The above code filters the text nodes from the direct child nodes of a given element.

Vergne answered 19/6, 2013 at 11:24 Comment(1)
... but not all the descendant child nodes (e.g. a text node that is the child of an element that is a child of the original element). - Select
1

For some reason contents() didn't work for me. In case it doesn't work for you either, here's a solution I made: jQuery.fn.descendants, with the option to include text nodes or not.

Usage


Get all descendants including text nodes and element nodes

jQuery('body').descendants('all');

Get all descendants returning only text nodes

jQuery('body').descendants(true);

Get all descendants returning only element nodes

jQuery('body').descendants();

Coffeescript Original:

jQuery.fn.descendants = ( textNodes ) ->

    # if textNodes is 'all' then textNodes and elementNodes are allowed
    # if textNodes is true then only textNodes will be returned
    # if textNodes is not provided as an argument then only element nodes
    # will be returned

    allowedTypes = if textNodes is 'all' then [1,3] else if textNodes then [3] else [1]

    # nodes we find
    nodes = []


    dig = (node) ->

        # loop through children
        for child in node.childNodes

            # push child to collection if has allowed type
            nodes.push(child) if child.nodeType in allowedTypes

            # dig through child if has children
            dig child if child.childNodes.length


    # loop and dig through nodes in the current
    # jQuery object
    dig node for node in this


    # wrap with jQuery
    return jQuery(nodes)

Drop In Javascript Version

var __indexOf=[].indexOf||function(e){for(var t=0,n=this.length;t<n;t++){if(t in this&&this[t]===e)return t}return-1}; /* indexOf polyfill ends here*/ jQuery.fn.descendants=function(e){var t,n,r,i,s,o;t=e==="all"?[1,3]:e?[3]:[1];i=[];n=function(e){var r,s,o,u,a,f;u=e.childNodes;f=[];for(s=0,o=u.length;s<o;s++){r=u[s];if(a=r.nodeType,__indexOf.call(t,a)>=0){i.push(r)}if(r.childNodes.length){f.push(n(r))}else{f.push(void 0)}}return f};for(s=0,o=this.length;s<o;s++){r=this[s];n(r)}return jQuery(i)}

Unminified Javascript version: http://pastebin.com/cX3jMfuD

This is cross browser, a small Array.indexOf polyfill is included in the code.
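For readers who don't use CoffeeScript, the traversal can be sketched in readable JavaScript. This is not the linked unminified version, just an approximation of the same logic: the function name collectDescendants is hypothetical, it returns a plain array rather than a jQuery object (the plugin's final jQuery(nodes) wrapping is left out so the sketch runs without jQuery), and the sample nodes are plain objects standing in for DOM nodes.

```javascript
// Sketch of the traversal only; the plugin itself wraps the result with jQuery(nodes).
function collectDescendants(roots, textNodes) {
    // 'all' -> elements and text nodes, truthy -> text nodes only, otherwise elements only
    var allowedTypes = textNodes === "all" ? [1, 3] : textNodes ? [3] : [1];
    var nodes = [];

    function dig(node) {
        for (var i = 0; i < node.childNodes.length; i++) {
            var child = node.childNodes[i];
            // keep the child if its node type is allowed
            if (allowedTypes.indexOf(child.nodeType) !== -1) {
                nodes.push(child);
            }
            // descend only when the child has children of its own
            if (child.childNodes.length) {
                dig(child);
            }
        }
    }

    for (var j = 0; j < roots.length; j++) {
        dig(roots[j]);
    }
    return nodes;
}

// Plain objects standing in for DOM nodes (hypothetical test data).
var leaf = { nodeType: 3, nodeValue: "hi", childNodes: [] };
var span = { nodeType: 1, nodeName: "SPAN", childNodes: [leaf] };
var root = { nodeType: 1, nodeName: "DIV", childNodes: [span] };

console.log(collectDescendants([root], true).length);  // text nodes only
console.log(collectDescendants([root]).length);        // element nodes only
console.log(collectDescendants([root], "all").length); // both
```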

Mutiny answered 16/2, 2014 at 5:16 Comment(0)
0

If you want to strip all tags, try this:

function:

String.prototype.stripTags=function(){
var rtag=/<.*?[^>]>/g;
return this.replace(rtag,'');
}

usage:

var newText=$('selector').html().stripTags();
Showker answered 22/6, 2011 at 18:36 Comment(0)
0

I had the same problem and solved it with:

Code:

$.fn.nextNode = function(){
  var contents = $(this).parent().contents();
  return contents.get(contents.index(this)+1);
}

Usage:

$('#my_id').nextNode();

It is like next(), but also returns text nodes.

Written answered 30/7, 2011 at 17:47 Comment(1)
.nextSibling is from the DOM specification: developer.mozilla.org/en/Document_Object_Model_(DOM)/… - Written
0

For me, plain old .contents() appeared to work to return the text nodes; you just have to be careful with your selectors so that you know the results will be text nodes.

For example, this wrapped all the text content of the TDs in my table with pre tags and had no problems.

jQuery("#resultTable td").contents().wrap("<pre/>")
Redouble answered 23/8, 2013 at 14:41 Comment(0)
0

This gets the job done regardless of the tag names. Select your parent.

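As written, it gives the text nodes of the parent and its descendants, with no duplications for parents and their children; uncomment the last line to get their text as strings instead.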

$('parent')
.find(":not(iframe)")
.addBack()
.contents()
.filter(function() {return this.nodeType == 3;})
//.map((i,v) => $(v).text()) // uncomment if you want strings
Ginaginder answered 3/12, 2022 at 17:44 Comment(0)
