Search keyword highlighting in ASP.NET

I am outputting a list of search results for a given string of keywords, and I want any matching keywords in my search results to be highlighted. Each word should be wrapped in a span or similar. I am looking for an efficient function to do this.

E.g.

Keywords: "lorem ipsum"

Result: "Some text containing lorem and ipsum"

Desired HTML output: "Some text containing <span class="hit">lorem</span> and <span class="hit">ipsum</span>"

My results are case insensitive.

Leann answered 27/10, 2009 at 10:31 Comment(0)

Here's what I've decided on: an extension method that I can call on the relevant strings within my page or a section of my page:

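using System;
using System.Text.RegularExpressions;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class StringExtensions
{
    public static string HighlightKeywords(this string input, string keywords)
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(keywords))
        {
            return input;
        }

        // Drop empty entries: an empty pattern would match at every position.
        string[] sKeywords = keywords.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string sKeyword in sKeywords)
        {
            // Escape the keyword so regex metacharacters are matched literally.
            // "$0" is the whole match, so the casing found in the input is kept.
            input = Regex.Replace(input, Regex.Escape(sKeyword), string.Format("<span class=\"hit\">{0}</span>", "$0"), RegexOptions.IgnoreCase);
        }
        return input;
    }
}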

Any further suggestions or comments?

Leann answered 27/10, 2009 at 14:48 Comment(5)
First, this is going to recognise partial matches within words. Your regex needs to do whole-word replacements only. Secondly, you can write ' ' instead of Convert.ToChar(" "). - Rattray
Thanks Richard - good tip for char, I knew there must be a better way but it hadn't clicked. Re partial matches, that's what I'm after in this case, as the search uses wildcards (hence the need to make things clearer with highlighting). - Leann
I'm not sure, but there are JavaScript files for text highlighting, e.g. eggheadcafe.com/articles/highlight_google_keywords.asp - Neogothic
This looks pretty much like the solution I just wrote for my project. I found a problem when I searched for more than one word and the last word was span, class or hit: it then matches the markup inserted for the earlier words, which badly breaks the output. I searched for a better solution and found this, so I want to give people a heads-up about what can go wrong with this approach. - Lynnett
What does adding the {0} and "$0" to string.Format() do? I understand it helps keep the case that's in input instead of sKeyword in the returned string, which works flawlessly in my case, but what exactly does it do? - Ectogenous
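One possible way around the problem raised in the comments above (a later keyword such as span, class or hit matching markup inserted for an earlier keyword) is to combine all keywords into one alternation and replace in a single pass, so inserted markup is never rescanned. The following is only a sketch under that assumption, not the original poster's code; HighlightSketch and HighlightAll are hypothetical names. It also illustrates "$0": in a .NET replacement string it stands for the entire match, which is why the text keeps the casing it had in the input. The input is assumed to be plain text that does not itself contain markup.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HighlightSketch
{
    // Hypothetical helper, not part of the original answer.
    public static string HighlightAll(string input, string keywords)
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrWhiteSpace(keywords))
        {
            return input;
        }

        // Escape every keyword so characters like '+' or '(' are matched literally.
        // Longer keywords go first so "span" is preferred over "pan" at the same position.
        string[] escaped = keywords
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .OrderByDescending(k => k.Length)
            .Select(k => Regex.Escape(k))
            .ToArray();

        string pattern = string.Join("|", escaped);

        // Single pass over the original text; "$0" is replaced by the whole match.
        return Regex.Replace(input, pattern, "<span class=\"hit\">$0</span>", RegexOptions.IgnoreCase);
    }
}
```

Called as HighlightSketch.HighlightAll("Some text containing Lorem and ipsum", "lorem ipsum"), this wraps "Lorem" and "ipsum" in hit spans and leaves the capital L of the input intact. Partial-word matches are still highlighted, which is what the question asks for.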

Try the highlighter from Lucene.Net:

http://incubator.apache.org/lucene.net/docs/2.0/Highlighter.Net/Lucene.Net.Highlight.html

How to use:

http://davidpodhola.blogspot.com/2008/02/how-to-highlight-phrase-on-results-from.html

EDIT: Since the Lucene.Net highlighter is not suitable here, a new link:

http://mhinze.com/archive/search-term-highlighter-httpmodule/

Shalondashalt answered 27/10, 2009 at 10:40 Comment(5)
Looks good, but do I have to be using Lucene.Net for my search results to use the Lucene highlighter functions? I'm actually just using a simple stored procedure (the data is only in one table so I don't want to have to build and maintain a separate Lucene index). - Leann
You can find the sources here: svn.apache.org/repos/asf/incubator/lucene.net/trunk/C%23/…. They may help you make a decision. - Shalondashalt
Hmm. It looks like you can use it only with Lucene. But maybe you can reuse some code from this project... - Shalondashalt
The first and last links are down. - Thermoluminescence
@ssg yes, there is a new version of Lucene itself and also a new documentation layout. The current link for this class is incubator.apache.org/lucene.net/docs/2.9.4/html/…. But this is not a permanent link. - Shalondashalt

Use the jQuery highlight plugin.

To highlight on the server side:

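// Requires: using System.IO; using System.Text; using System.Web.UI;
// Place this override inside a Page (or Control) subclass.
protected override void Render( HtmlTextWriter writer )
{
    // Render the page into a buffer instead of straight to the response.
    StringBuilder html = new StringBuilder();
    HtmlTextWriter w = new HtmlTextWriter( new StringWriter( html ) );

    base.Render( w );

    // Note: StringBuilder.Replace is case sensitive and also replaces text inside tags.
    html.Replace( "lorem", "<span class=\"hit\">lorem</span>" );

    writer.Write( html.ToString() );
}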

You can use regular expressions for advanced text replacing.

You can also put the above code in an HttpModule so that it can be reused in other applications.
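As one illustration of the regular-expression route mentioned above, the rendered HTML can be scanned with a pattern that matches either a whole tag or the keyword, copying tags through unchanged and wrapping only keyword matches. This is a rough sketch, not a full HTML parser: RenderedHtmlHighlighter and HighlightOutsideTags are hypothetical names, and it assumes tags contain no ">" inside attribute values and that script or style contents do not need protecting.

```csharp
using System.Text.RegularExpressions;

public static class RenderedHtmlHighlighter
{
    // Hypothetical helper: highlights a keyword in HTML text while leaving tags untouched.
    public static string HighlightOutsideTags(string html, string keyword)
    {
        if (string.IsNullOrEmpty(html) || string.IsNullOrEmpty(keyword))
        {
            return html;
        }

        // Group 1 captures a whole tag; the second alternative is the escaped keyword.
        string pattern = "(<[^>]*>)|" + Regex.Escape(keyword);

        return Regex.Replace(
            html,
            pattern,
            m => m.Groups[1].Success
                ? m.Value                                        // a tag: copy it through unchanged
                : "<span class=\"hit\">" + m.Value + "</span>", // text: wrap the match as found
            RegexOptions.IgnoreCase);
    }
}
```

Inside the Render override, the buffered markup could then be passed through this helper before being written, for example writer.Write(RenderedHtmlHighlighter.HighlightOutsideTags(html.ToString(), "lorem")).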

Precipitin answered 27/10, 2009 at 10:42 Comment(1)
Thanks for the idea. In this instance I'm trying to do this server side, as it needs to work on a variety of non-JavaScript devices. - Leann

An extension to the answer above (I don't have enough reputation to comment).

To keep the inserted span markup from being matched when the search terms are, for example, [span pan an a], each found word is first replaced with a placeholder, and the placeholders are swapped for the markup afterwards. Not very efficient, though.

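// Requires: using System; using System.Linq; using System.Text.RegularExpressions;
// searchQuery is assumed to be a string field of the containing class.
public string Highlight(string input)
{
    if (string.IsNullOrEmpty(input) || string.IsNullOrWhiteSpace(searchQuery))
    {
        return input;
    }

    // Strip the placeholder delimiter from the query and drop empty entries.
    string[] sKeywords = searchQuery.Replace("~", String.Empty).Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    int totalCount = sKeywords.Length + 1;
    string[] sHighlights = new string[totalCount];
    int count = 0;

    // Placeholder 0 stands for the whole query matched as a phrase.
    input = Regex.Replace(input, Regex.Escape(searchQuery.Trim()), string.Format("~{0}~", count), RegexOptions.IgnoreCase);
    sHighlights[count] = string.Format("<span class=\"highlight\">{0}</span>", searchQuery);

    // Longest keywords first, so "span" is replaced before "pan", "an" and "a".
    foreach (string sKeyword in sKeywords.OrderByDescending(s => s.Length))
    {
        count++;
        input = Regex.Replace(input, Regex.Escape(sKeyword), string.Format("~{0}~", count), RegexOptions.IgnoreCase);
        // Note: the highlighted text takes its casing from the query, not from the input.
        sHighlights[count] = string.Format("<span class=\"highlight\">{0}</span>", sKeyword);
    }

    // Swap each placeholder for its markup.
    for (int i = totalCount - 1; i >= 0; i--)
    {
        input = Regex.Replace(input, "\\~" + i + "\\~", sHighlights[i], RegexOptions.IgnoreCase);
    }

    return input;
}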
Tyree answered 13/12, 2013 at 5:54 Comment(0)