How do I select text nodes with jQuery?

Asked 18/11, 2008 at 13:45 Answered 3/12, 2022 at 17:44

408

I would like to get all descendant text nodes of an element, as a jQuery collection. What is the best way to do that?

Wold answered 18/11, 2008 at 13:45 Comment(0)

275

jQuery doesn't have a convenient function for this. You need to combine contents(), which will give just child nodes but includes text nodes, with find(), which gives all descendant elements but no text nodes. Here's what I've come up with:

var getTextNodesIn = function(el) {
    return $(el).find(":not(iframe)").addBack().contents().filter(function() {
        return this.nodeType == 3;
    });
};

getTextNodesIn(el);

Note: If you're using jQuery 1.7 or earlier, the code above will not work. To fix this, replace addBack() with andSelf(). andSelf() is deprecated in favour of addBack() from 1.8 onwards.

This is somewhat inefficient compared to pure DOM methods and has to include an ugly workaround for jQuery's overloading of its contents() function (thanks to @rabidsnail in the comments for pointing that out), so here is a non-jQuery solution using a simple recursive function. The includeWhitespaceNodes parameter controls whether or not whitespace text nodes are included in the output (in jQuery they are automatically filtered out).

Update: Fixed bug when includeWhitespaceNodes is falsy.

function getTextNodesIn(node, includeWhitespaceNodes) {
    var textNodes = [], nonWhitespaceMatcher = /\S/;

    function getTextNodes(node) {
        if (node.nodeType == 3) {
            if (includeWhitespaceNodes || nonWhitespaceMatcher.test(node.nodeValue)) {
                textNodes.push(node);
            }
        } else {
            for (var i = 0, len = node.childNodes.length; i < len; ++i) {
                getTextNodes(node.childNodes[i]);
            }
        }
    }

    getTextNodes(node);
    return textNodes;
}

getTextNodesIn(el);
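
To see the effect of includeWhitespaceNodes without a browser, the following sketch runs the same depth-first traversal over a hand-built tree of plain objects. The helpers text() and element() and the sample tree are hypothetical stand-ins for real DOM nodes; only nodeType, nodeValue and childNodes are modelled, so treat this as an illustration rather than a replacement for the function above.

```javascript
// Hypothetical stand-ins for DOM nodes: only the properties the traversal reads.
function text(value) {
  return { nodeType: 3, nodeValue: value, childNodes: [] };
}
function element(...children) {
  return { nodeType: 1, nodeValue: null, childNodes: children };
}

// Same logic as getTextNodesIn above, with the recursion in a named closure.
function getTextNodesIn(node, includeWhitespaceNodes) {
  const textNodes = [];
  const nonWhitespaceMatcher = /\S/;

  (function visit(current) {
    if (current.nodeType === 3) {
      // Keep the node if whitespace is allowed or it has a non-whitespace character.
      if (includeWhitespaceNodes || nonWhitespaceMatcher.test(current.nodeValue)) {
        textNodes.push(current);
      }
    } else {
      for (const child of current.childNodes) {
        visit(child);
      }
    }
  })(node);

  return textNodes;
}

// Roughly: <div>one <b>two</b>\n  <i>three</i></div>
const sampleTree = element(
  text("one "),
  element(text("two")),
  text("\n  "),
  element(text("three"))
);

const withoutWhitespace = getTextNodesIn(sampleTree, false).map((n) => n.nodeValue);
const withWhitespace = getTextNodesIn(sampleTree, true).map((n) => n.nodeValue);

console.log(withoutWhitespace); // the non-blank values, in document order
console.log(withWhitespace); // the same values plus the whitespace-only node
```

With a real element, the returned array can be wrapped as $(getTextNodesIn(el)) to get the jQuery collection the question asks for.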

Select answered 9/12, 2010 at 15:13 Comment(19)

Can the element passed in, be the name of a div? – Festinate 10/2, 2011 at 15:56

@crosenblum: You could call document.getElementById() first, if that's what you mean: var div = document.getElementById("foo"); var textNodes = getTextNodesIn(div); – Select 10/2, 2011 at 16:43

Because of a bug in jQuery if you have any iframes in el you'll need to use .find(':not(iframe)') instead of .find('*') . – Scrubby 3/2, 2012 at 0:29

@rabidsnail: I think, the use of .contents() anyways implies it will search through the iframe as well. I don't see how it could be a bug. – Fry 6/2, 2012 at 11:52

bugs.jquery.com/ticket/11275 Whether this is actually a bug seems to be up for debate, but bug or not if you call find('*').contents() on a node that contains an iframe which hasn't been added to the dom you'll get an exception at an undefined point. – Scrubby 13/2, 2012 at 21:31

@rabidsnail: OK, I think that's at least an annoyance (if not a bug) in jQuery and a point in favour of the plain DOM version. I'll edit my answer. Thanks. – Select 13/2, 2012 at 21:57

andSelf() was deprecated in jQuery 1.8, you can use addBack() instead. – Greening 14/2, 2013 at 14:3

You could consider nonWhitespace = /\S/ and if (includeWhitespaceNodes || nonWhitespace.test(node.nodeValue)) { which at least boasts greater simplicity (though it would respond differently to empty text nodes, if those are possible). I also think there could be improvement in the regex variable name... something like whitespaceMatcher or something to indicate what the variable is. – Zaporozhye 18/11, 2013 at 23:29

@ErikE: I like descriptive variable names. I have a feeling I picked whitespace to avoid the code having horizontal scrollbars on my browser. – Select 18/11, 2013 at 23:42

@ErikE: I agree with you on both counts and have edited my answer. Empty text nodes are indeed possible but will be treated the same by both !/^\s*$/.test() and /\S/.test() so there's no problem there. – Select 18/11, 2013 at 23:55

oh, right, it was * not + so empty nodes were matched before. Glad you liked my suggestions! – Zaporozhye 19/11, 2013 at 0:3

Great suggestion. I would recommend using Node.TEXT_NODE in place of 3 for better readability. – Watchful 28/3, 2014 at 14:58

@BenS: I would but Node.TEXT_NODE isn't supported in IE <= 8. – Select 28/3, 2014 at 16:31

This code has a bug in it. Right now when you pass false for including whitespace, it ONLY includes whitespace nodes instead of excluding them. The line if (includeWhitespaceNodes || !nonWhitespaceMatcher.test(node.nodeValue)) should instead read: if (includeWhitespaceNodes || nonWhitespaceMatcher.test(node.nodeValue)). – Gelman 11/6, 2014 at 18:55

@BrianGeihsler: You're right, thanks. I simplified the regular expression last November but failed to negate the condition. Wish I'd tested it now. – Select 11/6, 2014 at 22:35

@TimDown I tried your method but it gives the nodes out of order. What must be done to have tags in order? I asked a separate question here #63270623 – Serdab 6/8, 2020 at 4:16

@Amanda: I think the non-jQuery version will give you nodes in document order. – Select 6/8, 2020 at 8:5

@TimDown You mean the jquery version? I am using cheerio and I am not getting so. – Serdab 6/8, 2020 at 9:1

@TimDown I came across an answer by AKX at #63270623 . This seems to work. How is it different from what you proposed in the answer? Could you please explain/help? – Serdab 6/8, 2020 at 9:17

222

Jauco posted a good solution in a comment, so I'm copying it here:

$(elem)
  .contents()
  .filter(function() {
    return this.nodeType === 3; //Node.TEXT_NODE
  });

Wold answered 18/11, 2008 at 13:47 Comment(8)

actually $(elem) .contents() .filter(function() { return this.nodeType == Node.TEXT_NODE; }); is enough – Adne 11/7, 2009 at 13:53

IE7 doesn't define the Node global, so you have to use this.nodeType == 3, unfortunately: https://mcmap.net/q/25935/-node-text_node-and-ie7 – Wold 29/12, 2009 at 20:0

Does this not only return the text nodes that are the direct children of the element rather than descendants of the element as the OP requested? – Select 15/10, 2010 at 14:12

I've just noticed that your first answer from 2008 was almost exactly what I independently came up with much later. Why did you edit it? – Select 10/10, 2012 at 22:35

add .text() at the end if you want it to be a string. Otherwise it's still an object. Trying to show it in the document will end up displaying [object Object]. – Teapot 29/9, 2013 at 8:14

@ChristianOudard That would be really easy to polyfill, no? Would make your code a bit more legible. – Mestas 10/6, 2014 at 21:28

this will not work when the text node is deep nested inside other elements, because the contents() method only returns the immediate children nodes, api.jquery.com/contents – Methylamine 16/10, 2015 at 16:8

@Jauco, nope, not enough! as .contents() returns only the immediate children nodes – Methylamine 16/10, 2015 at 16:14

$('body').find('*').contents().filter(function () { return this.nodeType === 3; });

Saintjust answered 19/10, 2011 at 16:7 Comment(0)

jQuery.contents() can be used with jQuery.filter to find all child text nodes. With a little twist, you can find grandchildren text nodes as well. No recursion required:

$(function() {
  var $textNodes = $("#test, #test *").contents().filter(function() {
    return this.nodeType === Node.TEXT_NODE;
  });
  /*
   * for testing
   */
  $textNodes.each(function() {
    console.log(this);
  });
});

div { margin-left: 1em; }

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>

<div id="test">
  child text 1<br>
  child text 2
  <div>
    grandchild text 1
    <div>grand-grandchild text 1</div>
    grandchild text 2
  </div>
  child text 3<br>
  child text 4
</div>
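
A comment on this answer reports results coming back out of order. One plausible cause, offered here as an assumption rather than a confirmed diagnosis, is that collecting each matched element's child nodes in turn groups text nodes by parent instead of interleaving them in document order. The sketch below imitates both strategies on a hand-built tree shaped loosely like the #test markup above. The helpers text() and element() and both collector functions are hypothetical and use plain objects, not real DOM nodes or jQuery.

```javascript
// Hypothetical stand-ins for DOM nodes.
function text(value) {
  return { nodeType: 3, nodeValue: value, childNodes: [] };
}
function element(...children) {
  return { nodeType: 1, nodeValue: null, childNodes: children };
}

// All elements in document order, root first (similar in spirit to "#test, #test *").
function elementsInOrder(root) {
  const found = [];
  (function visit(node) {
    if (node.nodeType !== 1) return;
    found.push(node);
    node.childNodes.forEach((child) => visit(child));
  })(root);
  return found;
}

// Strategy A: for each element in turn, take its text-node children.
function textNodesGroupedByParent(root) {
  return elementsInOrder(root).flatMap((el) =>
    el.childNodes.filter((child) => child.nodeType === 3)
  );
}

// Strategy B: a depth-first walk, which visits text nodes in document order.
function textNodesInDocumentOrder(root) {
  const found = [];
  (function visit(node) {
    if (node.nodeType === 3) found.push(node);
    else node.childNodes.forEach((child) => visit(child));
  })(root);
  return found;
}

const tree = element(
  text("child 1"),
  element(
    text("grandchild 1"),
    element(text("great-grandchild 1")),
    text("grandchild 2")
  ),
  text("child 2")
);

const grouped = textNodesGroupedByParent(tree).map((n) => n.nodeValue);
const ordered = textNodesInDocumentOrder(tree).map((n) => n.nodeValue);

console.log(grouped); // both "child" texts come before any "grandchild" text
console.log(ordered); // texts appear in the order they occur in the markup
```

If document order matters, a recursive walk like strategy B (or the non-jQuery function in the accepted answer) is the safer choice.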


Belvia answered 19/1, 2014 at 10:29 Comment(1)

I tried this. It prints tag names out of order. Is there a way to print tag names in the order they occur? I asked a separate question here #63276878 – Serdab 6/8, 2020 at 4:27

I was getting a lot of empty text nodes with the accepted filter function. If you're only interested in selecting text nodes that contain non-whitespace, try adding a nodeValue conditional to your filter function, like a simple $.trim(this.nodeValue) !== '':

$('element')
    .contents()
    .filter(function(){
        return this.nodeType === 3 && $.trim(this.nodeValue) !== '';
    });

http://jsfiddle.net/ptp6m97v/

Or, to avoid strange situations where the content looks like whitespace but is not (e.g. the soft hyphen &shy; character, newlines \n, tabs, etc.), you can try using a regular expression. For example, \S will match any non-whitespace character:

$('element')
        .contents()
        .filter(function(){
            return this.nodeType === 3 && /\S/.test(this.nodeValue);
        });
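
To check how the two tests treat awkward characters, here is a small sketch that runs both on a few sample strings. It uses the native String.prototype.trim rather than $.trim, so it says nothing about how older jQuery versions behave; the function names and the sample list are made up for illustration.

```javascript
// Two ways to decide whether a text node's value is "blank".
function isBlankByTrim(value) {
  return value.trim() === "";
}
function isBlankByRegex(value) {
  return !/\S/.test(value);
}

// Illustrative samples: [label, value].
const samples = [
  ["spaces, newline, tab", " \n\t "],
  ["non-breaking space", "\u00A0"],
  ["soft hyphen", "\u00AD"],
  ["ordinary text", " text "],
  ["empty string", ""],
];

const results = samples.map(([label, value]) => ({
  label,
  byTrim: isBlankByTrim(value),
  byRegex: isBlankByRegex(value),
}));

console.table(results);
```

In a current JavaScript engine both checks should agree on every sample: the soft hyphen counts as a non-whitespace character under either test, while the non-breaking space counts as whitespace.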

Grassi answered 13/10, 2014 at 18:3 Comment(1)

I tried this. It prints tag names out of order. Is there a way to print tag names in the order they occur? I asked a separate question here #63276878 – Serdab 6/8, 2020 at 4:28

If you can make the assumption that all children are either Element Nodes or Text Nodes, then this is one solution.

To get all child text nodes as a jQuery collection:

$('selector').clone().children().remove().end().contents();

To get a copy of the original element with non-text children removed:

$('selector').clone().children().remove().end();

Metalanguage answered 20/4, 2011 at 14:52 Comment(1)

Just noticed Tim Down's comment on another answer. This solution only gets the direct children, not all descendants. – Metalanguage 20/4, 2011 at 14:58

Can also be done like this:

var textContents = $(document.getElementById("ElementId").childNodes).filter(function(){
        return this.nodeType == 3;
});

The above code filters the text nodes from the direct child nodes of a given element.

Vergne answered 19/6, 2013 at 11:24 Comment(1)

... but not all the descendant child nodes (e.g. a text node that is the child of an element that is a child of the original element). – Select 17/10, 2013 at 21:15

For some reason contents() didn't work for me, so in case it doesn't work for you either, here's a solution I made: jQuery.fn.descendants, with an option to include text nodes or not.

Usage

Get all descendants including text nodes and element nodes

jQuery('body').descendants('all');

Get all descendants returning only text nodes

jQuery('body').descendants(true);

Get all descendants returning only element nodes

jQuery('body').descendants();

Coffeescript Original:

jQuery.fn.descendants = ( textNodes ) ->

    # if textNodes is 'all' then textNodes and elementNodes are allowed
    # if textNodes is true then only textNodes will be returned
    # if textNodes is not provided as an argument then only element nodes
    # will be returned

    allowedTypes = if textNodes is 'all' then [1,3] else if textNodes then [3] else [1]

    # nodes we find
    nodes = []


    dig = (node) ->

        # loop through children
        for child in node.childNodes

            # push child to collection if has allowed type
            nodes.push(child) if child.nodeType in allowedTypes

            # dig through child if has children
            dig child if child.childNodes.length


    # loop and dig through nodes in the current
    # jQuery object
    dig node for node in this


    # wrap with jQuery
    return jQuery(nodes)

Drop-in JavaScript version:

var __indexOf=[].indexOf||function(e){for(var t=0,n=this.length;t<n;t++){if(t in this&&this[t]===e)return t}return-1}; /* indexOf polyfill ends here*/ jQuery.fn.descendants=function(e){var t,n,r,i,s,o;t=e==="all"?[1,3]:e?[3]:[1];i=[];n=function(e){var r,s,o,u,a,f;u=e.childNodes;f=[];for(s=0,o=u.length;s<o;s++){r=u[s];if(a=r.nodeType,__indexOf.call(t,a)>=0){i.push(r)}if(r.childNodes.length){f.push(n(r))}else{f.push(void 0)}}return f};for(s=0,o=this.length;s<o;s++){r=this[s];n(r)}return jQuery(i)}

Unminified Javascript version: http://pastebin.com/cX3jMfuD

This is cross browser, a small Array.indexOf polyfill is included in the code.
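
For readers who prefer not to decode the minified build, here is a readable sketch of the same traversal in plain JavaScript. It is not the author's unminified source (that is at the pastebin link above). The name collectDescendants is hypothetical, it returns a plain array instead of a jQuery object, it relies on Array.prototype.includes rather than the bundled polyfill, and the sample tree uses plain objects in place of DOM nodes.

```javascript
// textNodes === "all" -> elements and text nodes
// textNodes truthy    -> text nodes only
// textNodes omitted   -> element nodes only
function collectDescendants(roots, textNodes) {
  const allowedTypes = textNodes === "all" ? [1, 3] : textNodes ? [3] : [1];
  const nodes = [];

  function dig(node) {
    for (const child of node.childNodes) {
      // Collect the child if its node type is allowed.
      if (allowedTypes.includes(child.nodeType)) nodes.push(child);
      // Recurse only when the child has children of its own.
      if (child.childNodes.length) dig(child);
    }
  }

  for (const root of roots) dig(root);
  return nodes;
}

// Hypothetical stand-ins for DOM nodes.
function text(value) {
  return { nodeType: 3, nodeValue: value, childNodes: [] };
}
function element(...children) {
  return { nodeType: 1, nodeValue: null, childNodes: children };
}

// Roughly: <div>a<span>b</span><br></div>
const root = element(text("a"), element(text("b")), element());

const everything = collectDescendants([root], "all");
const onlyText = collectDescendants([root], true);
const onlyElements = collectDescendants([root]);

console.log(everything.length, onlyText.length, onlyElements.length);
```

Wrapping this as a plugin would presumably amount to returning jQuery(collectDescendants(this.toArray(), textNodes)) from jQuery.fn.descendants.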

Mutiny answered 16/2, 2014 at 5:16 Comment(0)

If you want to strip all tags, then try this:

function:

String.prototype.stripTags=function(){
var rtag=/<.*?[^>]>/g;
return this.replace(rtag,'');
}

usage:

var newText=$('selector').html().stripTags();
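
A variant that avoids extending String.prototype, sketched as a standalone function. The name stripTags and the simpler pattern /<[^>]*>/g are choices made here, not the answer's code. Like any regex-based approach, it is only suitable for simple markup and will mishandle cases such as a > character inside an attribute value.

```javascript
// Remove anything that looks like a tag: "<", any run of non-">" characters, ">".
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

const stripped = stripTags('<p>Hello <b class="x">world</b>!</p>');
console.log(stripped); // Hello world!
```

With jQuery it would be called as stripTags($('selector').html()). If only the text is needed, jQuery's .text() returns the combined text content without any string processing.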

Showker answered 22/6, 2011 at 18:36 Comment(0)

I had the same problem and solved it with:

Code:

$.fn.nextNode = function(){
  var contents = $(this).parent().contents();
  return contents.get(contents.index(this)+1);
}

Usage:

$('#my_id').nextNode();

It is like next(), but also returns text nodes.

Written answered 30/7, 2011 at 17:47 Comment(1)

.nextSibling is from Dom specification: developer.mozilla.org/en/Document_Object_Model_(DOM)/… – Written 14/2, 2012 at 10:15

For me, plain old .contents() appeared to work to return the text nodes, just have to be careful with your selectors so that you know they will be text nodes.

For example, this wrapped all the text content of the TDs in my table with pre tags and had no problems.

jQuery("#resultTable td").contents().wrap("<pre/>")

Redouble answered 23/8, 2013 at 14:41 Comment(0)

This gets the job done regardless of the tag names. Select your parent.

It gives the text nodes of the parent and all of its descendants, with no duplicates. Uncomment the map() line if you want their strings instead of the nodes.

$('parent')
.find(":not(iframe)")
.addBack()
.contents()
.filter(function() {return this.nodeType == 3;})
//.map((i,v) => $(v).text()) // uncomment if you want strings

Ginaginder answered 3/12, 2022 at 17:44 Comment(0)
