Wildcard searching and highlighting with Solr 1.4

I've got a pretty much vanilla install of Solr 1.4, apart from a few small config and schema changes. The default request handler is configured like this:

<requestHandler name="standard" class="solr.SearchHandler" default="true">
    <!-- default values for query parameters -->
    <lst name="defaults">
        <str name="defType">dismax</str>
        <str name="echoParams">explicit</str>
        <str name="qf">
            text
        </str>
        <str name="spellcheck.dictionary">default</str>
        <str name="spellcheck.onlyMorePopular">false</str>
        <str name="spellcheck.extendedResults">false</str>
        <str name="spellcheck.count">1</str>
    </lst>
</requestHandler>

The main field type I'm using for indexing is this:

<fieldType name="textNoHTML" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
    </analyzer>
</fieldType>

Now, when I perform a search using

"q=search+term&hl=on"

I get highlighting, and nice accurate scores.

But for wildcard searches, I'm assuming you need to use "q.alt". Is that true? If so, my query looks like this:

"q.alt=search*&hl=on"

When I use the above query, highlighting doesn't work, and all the scores are "1.0".

What am I doing wrong? Is what I want possible without bypassing some of the really cool Solr optimizations?

Cheers!

Bimetallic answered 10/3, 2010 at 1:35 Comment(2)
Some info I found about this: old.nabble.com/Wildcard-on-q.alt-with-Dismax-td17722791.html mail-archive.com/[email protected]/msg21518.html However, it would seem that those issues were fixed for 1.4. I'll keep looking... - Hostess
Cool, cheers Mauricio. I've found quite a lot of info on this topic, but the discussions never address what parameters I need to use, or whether I can still use highlighting, scoring, spellchecking, etc. Cheers though. - Bimetallic
L
8

As far as I know, you can't use wildcards with the dismax handler; see http://wiki.apache.org/solr/DisMaxRequestHandler#q.

To simulate wildcard searching I used EdgeNGrams, following some of the instructions here: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/. In practice, all I did was add the edgytext field type to schema.xml and change the type of the field I wanted to search to edgytext.
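For illustration, a minimal sketch of what such a field type could look like in schema.xml. This is not copied from my actual schema: the whitespace tokenizer and the gram sizes are assumptions chosen so that prefixes of individual words match. Only the index analyzer produces edge n-grams; the query analyzer leaves the term as typed, so a plain dismax query such as "sear" matches the stored prefixes of "search" without any wildcard syntax.

```xml
<!-- Hypothetical sketch: index-time edge n-grams to emulate trailing wildcards. -->
<fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- Emits s, se, sea, sear, ... for each token; sizes are assumptions. -->
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
    </analyzer>
    <analyzer type="query">
        <!-- No n-gram filter here: the typed prefix is matched as a whole term. -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

The trade-off is a larger index, and a full reindex is needed after changing the field type.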

Hope this helps!

Lacedaemonian answered 17/5, 2010 at 15:5 Comment(1)
Glad I could help! I was quite frustrated myself :) - Lacedaemonian
M
5

Or you can grab the latest nightly build and use edismax (ExtendedDismaxQParser).

It handles both trailing and leading wildcards.
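As a rough sketch, and assuming a build that ships the parser registered under the name "edismax", switching over could be as small as changing defType in the handler from the question. The remaining defaults from the original handler are omitted here for brevity.

```xml
<!-- Sketch only: assumes a Solr build that includes the edismax parser. -->
<requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
        <!-- Was "dismax" in the original configuration. -->
        <str name="defType">edismax</str>
        <str name="echoParams">explicit</str>
        <str name="qf">
            text
        </str>
    </lst>
</requestHandler>
```

With that in place, the wildcard should be able to go straight into q, for example "q=search*&hl=on", rather than into q.alt.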

Millenarianism answered 23/5, 2010 at 22:48 Comment(1)
Cool, thanks Jem, I'll check that out. By the way, are you on the Solr mailing list? It would be good if Solr could make SO their official Q&A place... those mailing lists are really unintuitive. - Bimetallic
