Highlight search terms (select only leaf nodes)

I would like to highlight search terms on a page without messing with any HTML tags. I was thinking of something like:

$('.searchResult *').each(function() {
    $(this).html($(this).html().replace(new RegExp('(term)', 'gi'), '<span class="highlight">$1</span>'));
});

However, $('.searchResult *').each matches all elements, not just leaf nodes. In other words, some of the elements matched have HTML inside them. So I have a few questions:

  1. How can I match only leaf nodes?
  2. Is there some built-in jQuery RegEx function to simplify things? Something like: $(this).wrap('term', $('<span />', { 'class': 'highlight' }))
  3. Is there a way to do a simple string replace and not a RegEx?
  4. Any other better/faster way of doing this?

Thanks so much!

Tanager answered 13/7, 2010 at 20:21 Comment(1)
You can use e.g. mark.js - Kulsrud
H
8

[See it in action]

// escape by Colin Snover
// Note: if you don't care for (), you can remove it..
RegExp.escape = function(text) {
    return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}

function highlight(term, base) {
  if (!term) return;
  base = base || document.body;
  // Build the regex once; the capture group lets $1 keep the matched text's case.
  var re = new RegExp("(" + RegExp.escape(term) + ")", "gi");
  var replacement = "<span class='highlight'>$1</span>";
  $("*", base).contents().each( function(i, el) {
    // Only touch text nodes (nodeType 3) that actually contain the term.
    // search() always starts at index 0, so the g flag causes no skipped matches.
    if (el.nodeType === 3 && el.data.search(re) !== -1) {
      var wrapper = $("<span>").html(el.data.replace(re, replacement));
      $(el).before(wrapper.contents()).remove();
    }
  });
}

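function dehighlight(term, base) {
  // Restore each span's own text so the original casing is preserved.
  $('span.highlight', base).each(function () {
    var parent = this.parentNode;
    parent.replaceChild(document.createTextNode($(this).text()), this);
    parent.normalize(); // merge the adjacent text nodes back together
  });
}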
Hangout answered 13/7, 2010 at 20:21 Comment(6)
The See it in action example isn't working for me. However, I had forgotten about the :contains selector, which should help with selecting the "leaf" nodes and avoid doing a replace unnecessarily. I'll give this a try. - Tanager
I'm guessing it would be more efficient to create the RegExp variable once before the each and reuse it inside each? - Tanager
Yes, it will be precompiled and should be faster. Check the link now :) - Placatory
@Nelson - contains does a case-sensitive search, so it might be better to do a regex search on text() instead. Also, any solution where html is being overwritten suffers from two problems - existing behavior such as events will get overwritten, and the search term may collide with the html. See jsfiddle.net/BcsQG/1 - Mano
Using the "highlight(term, base) {}" function on ajax response data, I had to add this just before the closing brace: ` return base;`, to make it work. Otherwise there was no output. - Desexualize
Awesome! One problem -- while this will highlight matches with different case, it will lowercase them. You can fix it by changing the replacement to "<span class='highlight'>$1</span>". - Aftmost
M
3

Use contents() to get all nodes including text nodes, filter out the non-text nodes, and finally replace each remaining text node with its highlighted version using a regex. This keeps the HTML elements intact and only modifies the text nodes. A regex is needed rather than a simple string substitution because String.replace with a string pattern only replaces the first occurrence and cannot ignore case.

function highlight(term) {
    var regex = new RegExp("(" + term + ")", "gi");
    var localRegex = new RegExp("(" + term + ")", "i");
    var replace = '<span class="highlight">$1</span>';

    $('body *').contents().each(function() {
        // skip all non-text nodes, and text nodes that don't contain term
        if(this.nodeType != 3 || !localRegex.test(this.nodeValue)) {
            return;
        }
        // replace text node with new node(s)
        var wrapped = $('<div>').append(this.nodeValue.replace(regex, replace));
        $(this).before(wrapped.contents()).remove();
    });
}

It can't easily be turned into a much shorter one-liner now, so I prefer it like this :)
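
To illustrate the point about string patterns, here is a minimal sketch, assuming a case-sensitive match is acceptable and the input is the data of a single text node. `wrapAll` is a hypothetical helper name, not part of jQuery; it uses split/join to reach every occurrence without building a RegExp.

```javascript
// Hypothetical helper: wrap every case-sensitive occurrence of `term`
// in `text` without a regex, using split/join.
function wrapAll(text, term) {
  if (!term) return text;
  return text.split(term).join('<span class="highlight">' + term + '</span>');
}

// A string pattern passed to replace() only affects the first occurrence.
var firstOnly = 'foo bar foo'.replace('foo', '[foo]');

// split/join touches every occurrence, but cannot ignore case.
var all = wrapAll('foo bar foo', 'foo');
```

This covers the "simple string replace" case from the question, but a case-insensitive highlight still needs the regex approach.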

See example here.

Mano answered 13/7, 2010 at 20:33 Comment(5)
This is buggy: we can't set the nodeValue of a text node and hope it will work :). We have to replace the text node with a span element. - Mano
Fixed the bugs, now only does text node replacements. Does not replace the entire html. - Mano
It will fail for things like (this) - Placatory
@Hangout - could you elaborate more on why (this) would be a breaking input? - Mano
Ah I see, thanks for pointing that out. I am tempted to use your RegExp.escape solution, but will let this bug pass instead :) - Mano
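
The `(this)` problem raised in these comments can be shown with plain strings. This is a small sketch, not the answer's code: `escapeRegExp` is a hypothetical helper that backslash-escapes regex metacharacters, similar in spirit to the RegExp.escape function in the other answer.

```javascript
// Hypothetical helper: backslash-escape regex metacharacters in a search term.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var term = '(this)';

// Unescaped: the parentheses form a group, so the pattern matches just "this".
var unescaped = new RegExp('(' + term + ')', 'gi');
// Escaped: the parentheses are literal, so only "(this)" matches.
var escaped = new RegExp('(' + escapeRegExp(term) + ')', 'gi');

var sample = 'see this and (this)';
var loose = sample.replace(unescaped, '[$1]');
var strict = sample.replace(escaped, '[$1]');
```

An unbalanced term such as `(this` is worse: building the unescaped pattern throws a SyntaxError instead of merely matching the wrong text.
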
P
2

I'd give the Highlight jQuery plugin a shot.

Plowshare answered 13/7, 2010 at 21:12 Comment(3)
I saw that, but it does a temporary highlight. I need to keep the terms highlighted. Also, the fade effect probably wouldn't be a good idea with dozens or hundreds of matches on a page. - Tanager
You may have clicked it before I edited it. I had the wrong link originally. The one in jQueryUI is indeed temporary, but the one on johannburkard.de is permanent until you call removeHighlight(), and doesn't have a fade effect. - Plowshare
The new link works as expected. I may end up using this in the end, but galambalazs answered my questions more directly. - Tanager
B
2

I've made a pure JavaScript version of this and packaged it into a Google Chrome plug-in, which I hope will be helpful to some people. The core function is shown below:

GitHub Page for In-page Highlighter

function highlight(term){
    if(!term){
        return false;
    }

    //use treeWalker to find all text nodes that match selection
    //supported by Chrome(1.0+)
    //see more at https://developer.mozilla.org/en-US/docs/Web/API/TreeWalker
    var treeWalker = document.createTreeWalker(
        document.body,
        NodeFilter.SHOW_TEXT,
        null,
        false
        );
    var node = null;
    var matches = [];
    while(node = treeWalker.nextNode()){
        if(node.nodeType === 3 && node.data.indexOf(term) !== -1){
            matches.push(node);
        }
    }

    //deal with those matched text nodes
    for(var i=0; i<matches.length; i++){
        node = matches[i];
        //empty the parent node
        var parent = node.parentNode;
        if(!parent){
            parent = node;
            parent.nodeValue = '';
        }
        //prevent duplicate highlighting
        else if(parent.className == "highlight"){
            continue;
        }
        else{
            while(parent && parent.firstChild){
                parent.removeChild(parent.firstChild);
            }
        }

        //find every occurrence using split function
        var parts = node.data.split(new RegExp('('+term+')'));
        for(var j=0; j<parts.length; j++){
            var part = parts[j];
            //continue if it's empty
            if(!part){
                continue;
            }
            //create new element node to wrap selection
            else if(part == term){
                var newNode = document.createElement("span");
                newNode.className = "highlight";
                newNode.innerText = part;
                parent.appendChild(newNode);
            }
            //create new text node to place remaining text
            else{
                var newTextNode = document.createTextNode(part);
                parent.appendChild(newTextNode);
            }
        }

    }
}
Bakken answered 10/9, 2013 at 21:59 Comment(1)
Hey, could you walk me through the process? Why did you clear the parent node? - Dependable
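
On the question in this comment: emptying the parent does not look strictly necessary, and it also discards any sibling elements that sit next to the matched text node. Below is a minimal sketch of an alternative, assuming a browser DOM. `splitByTerm` and `highlightTextNode` are hypothetical names, not part of the plug-in. Only the text node itself is swapped for a fragment, so its siblings stay where they are.

```javascript
// Hypothetical helper: split `text` into {text, match} pieces around each
// case-sensitive occurrence of `term`, with no regex involved.
function splitByTerm(text, term) {
  var pieces = [];
  var pos = 0;
  var idx;
  while (term && (idx = text.indexOf(term, pos)) !== -1) {
    if (idx > pos) pieces.push({ text: text.slice(pos, idx), match: false });
    pieces.push({ text: term, match: true });
    pos = idx + term.length;
  }
  if (pos < text.length) pieces.push({ text: text.slice(pos), match: false });
  return pieces;
}

// Replace one text node with a fragment of text nodes and highlight spans.
// Siblings of the text node are left in place. Requires a browser DOM.
function highlightTextNode(node, term) {
  var fragment = document.createDocumentFragment();
  splitByTerm(node.data, term).forEach(function (piece) {
    if (piece.match) {
      var span = document.createElement('span');
      span.className = 'highlight';
      span.textContent = piece.text;
      fragment.appendChild(span);
    } else {
      fragment.appendChild(document.createTextNode(piece.text));
    }
  });
  node.parentNode.replaceChild(fragment, node);
}
```

`highlightTextNode` would be called for each entry collected in the `matches` array, in place of the parent-clearing loop.
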
E
1

I spent hours searching the web for code that could highlight search terms as the user types, and none could do what I wanted until I combined a bunch of stuff together to do this (jsfiddle demo here):

$.fn.replaceText = function(search, replace, text_only) {
    //https://mcmap.net/q/752952/-how-do-i-use-jquery-to-replace-all-occurring-of-a-certain-word-in-a-webpage
    return this.each(function(){  
        var v1, v2, rem = [];
        $(this).find("*").addBack().contents().each(function(){ // addBack() (jQuery 1.8+) replaces andSelf(), which was removed in jQuery 3.0
            if(this.nodeType === 3) {
                v1 = this.nodeValue;
                v2 = v1.replace(search, replace);
                if(v1 != v2) {
                    if(!text_only && /<.*>/.test(v2)) {  
                        $(this).before( v2 );  
                        rem.push(this);  
                    } else {
                        this.nodeValue = v2;  
                    }
                }
            }
        });
        if(rem.length) {
            $(rem).remove();
        }
    });
};

function replaceParentsWithChildren(parentElements){
    parentElements.each(function() {
        var parent = this;
        var grandparent = parent.parentNode;
        $(parent).replaceWith(parent.childNodes);
        grandparent.normalize();//merge adjacent text nodes
    });
}

function highlightQuery(query, highlightClass, targetSelector, selectorToExclude){
    replaceParentsWithChildren($('.' + highlightClass));//Remove old highlight wrappers.
    $(targetSelector).replaceText(new RegExp(query, "gi"), function(match) {
        return '<span class="' + highlightClass + '">' + match + "</span>";
    }, false);
    replaceParentsWithChildren($(selectorToExclude + ' .' + highlightClass));//Do not highlight children of this selector.
}
Erythropoiesis answered 4/3, 2013 at 22:55 Comment(0)
M
0

Here's a naive implementation that just blasts in HTML for any match:

<!DOCTYPE html>
<html lang="en">
<head>
    <title>Select Me</title>
    <style>
        .highlight {
            background:#FF0;
        }
    </style>
    <script type="text/javascript" src="http://ajax.microsoft.com/ajax/jquery/jquery-1.4.2.min.js"></script>
    <script type="text/javascript">

        $(function () {

            hightlightKeyword('adipisicing');

        });

        function hightlightKeyword(keyword) {

            var replacement = '<span class="highlight">$&</span>'; // $& inserts the matched text, preserving its case
            var search = new RegExp(keyword, "gi");
            var newHtml = $('body').html().replace(search, replacement);
            $('body').html(newHtml);
        }

    </script>
</head>
<body>
    <div>

        <p>Lorem ipsum dolor sit amet, consectetur <b>adipisicing</b> elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
        <p>Lorem ipsum dolor sit amet, <em>consectetur adipisicing elit</em>, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
        <p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>

    </div>
</body>
</html>
Mandle answered 13/7, 2010 at 21:6 Comment(1)
Yeah, the problem with that is it can match on html tags. If the keyword is p (paragraph), your HTML is mangled. - Tanager
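
The mangling described in this comment can be reproduced on a plain string, without a page. In this small sketch `naiveHighlight` is a hypothetical stand-in for the answer's function, applied to an HTML string instead of `$('body').html()`.

```javascript
// Same kind of replacement as the naive approach, applied to an HTML string.
function naiveHighlight(html, keyword) {
  var replacement = '<span class="highlight">' + keyword + '</span>';
  return html.replace(new RegExp(keyword, 'gi'), replacement);
}

// The keyword "p" matches inside the <p> and </p> tags as well as in the text.
var mangled = naiveHighlight('<p>pea</p>', 'p');
var spanCount = mangled.split('<span class="highlight">').length - 1;
```

Here three spans are inserted, two of them inside tag names, which is why the text-node approaches in the other answers are safer.
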
M
0

My reputation is not high enough for a comment or adding more links, so I am sorry to write a new answer without all references.

I was interested in the performance of the solutions mentioned above and added some code for measurement. To keep it simple I added only these lines:

var start = new Date();
// highlighting code goes here ...
var end = new Date();
var ms = end.getTime() - start.getTime();
jQuery("#time-ms").text(ms);

I forked Anurag's solution with these lines, and this resulted in 40-60ms on average.
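
A single Date-based measurement can be noisy. As a rough sketch of a slightly more robust variant, the hypothetical helper `timeIt` below averages several runs and uses `performance.now()` where it is available, falling back to `Date.now()`. The workload is a stand-in string replacement, not the actual highlighting code from the fiddles.

```javascript
// Hypothetical helper: run `fn` `runs` times and return the mean duration in ms.
function timeIt(fn, runs) {
  var now = (typeof performance !== 'undefined' && performance.now)
    ? function () { return performance.now(); }
    : function () { return Date.now(); };
  var start = now();
  for (var i = 0; i < runs; i++) {
    fn();
  }
  return (now() - start) / runs;
}

// Stand-in workload; in a page this would be the highlighting call.
var meanMs = timeIt(function () {
  'lorem ipsum dolor'.replace(/(ipsum)/gi, '<span class="highlight">$1</span>');
}, 1000);
```

When timing real DOM highlighting this way, the page would need to be reset between runs, since a second pass over already highlighted content does different work.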

So I forked this fiddle and made some improvements to fit my needs. One was the RegEx escaping (please see the answer from CoolAJ86 in "escape-string-for-use-in-javascript-regex" on Stack Overflow). Another was avoiding the second 'new RegExp()': RegExp.test returns on the first match, so a single regex can serve for both testing and replacing. Note that with the global flag, test() also advances lastIndex, so it should be reset to 0 before each test (please see a JavaScript reference on RegExp.test).
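
One caveat when reusing a single global regex for test(): a regex with the g flag remembers where the last match ended in its lastIndex property, and the next test() call resumes from there, even on a different string. The sketch below uses a literal pattern purely for illustration.

```javascript
var re = /term/gi;

var first = re.test('term');         // true, and lastIndex is now 4
var indexAfterFirst = re.lastIndex;
var second = re.test('term');        // false: the search resumed at index 4
var indexAfterSecond = re.lastIndex; // reset to 0 after the failed match

// Resetting lastIndex (or testing with a regex without the g flag) avoids the skip.
re.lastIndex = 0;
var third = re.test('term');
```

In a loop over text nodes this can make every other matching node look like a miss, so either reset lastIndex before each test or use String.prototype.search, which always starts at index 0.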

On my machine (chromium, linux) I get runtimes of about 30-50ms. You can test this yourself in this jsfiddle.

I also added my timers to the highest rated solution, by galambalazs; you can find this in this jsFiddle. That one has runtimes of 60-100ms.

The values in milliseconds become even higher, and matter much more, when running in other browsers (e.g. in Firefox, about a quarter of a second).

Maldon answered 18/12, 2013 at 13:16 Comment(0)
