I'm having difficulty working out either how to construct the Solr query or how to set up the schema so that searches in our web store work better.
First, some configuration (Solr 4.2.1):
<field name="mfgpartno" type="text_en_splitting_tight" indexed="true" stored="true" />
<field name="mfgpartno_sort" type="string" indexed="true" stored="false" />
<field name="mfgpartno_search" type="sku_partial" indexed="true" stored="true" />
<copyField source="mfgpartno" dest="mfgpartno_sort" />
<copyField source="mfgpartno" dest="mfgpartno_search" />
<fieldType name="sku_partial" class="solr.TextField" omitTermFreqAndPositions="true">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="1" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.TrimFilterFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false"/>
<filter class="solr.NGramFilterFactory" minGramSize="4" maxGramSize="100" side="front" />
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.TrimFilterFactory" />
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="false"/>
</analyzer>
</fieldType>
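To make it easier to reason about what the two analyzers of sku_partial do to a part number, here is a rough Python approximation of both chains. It is only a sketch, not Solr's actual behavior: the function names are made up for illustration, stopwords are ignored, the WordDelimiterFilter and StandardTokenizer rules are reduced to simple regular expressions, and I'm assuming NGramFilterFactory emits every substring of 4 to 100 characters (that is, that the side="front" attribute has no effect on it). The Analysis screen in the Solr admin UI remains the authoritative way to see the real tokens.

```python
import re


def word_delimiter_parts(token):
    """Crude stand-in for WordDelimiterFilterFactory with the options above."""
    # Split on delimiters and on letter/digit transitions.
    runs = re.findall(r"[A-Za-z]+|[0-9]+", token)
    parts = []
    for run in runs:
        # splitOnCaseChange: break at lower-to-upper transitions.
        parts.extend(re.sub(r"(?<=[a-z])(?=[A-Z])", " ", run).split())
    # preserveOriginal + generated parts + catenateAll
    return [token] + parts + ["".join(parts)]


def ngrams(term, min_gram=4, max_gram=100):
    """All substrings of length min_gram..max_gram (assumed NGram behavior)."""
    top = min(max_gram, len(term))
    return {
        term[i:i + n]
        for n in range(min_gram, top + 1)
        for i in range(len(term) - n + 1)
    }


def index_terms(value):
    """Approximate terms stored in mfgpartno_search for one field value."""
    terms = set()
    for token in value.split():  # WhitespaceTokenizer
        for part in word_delimiter_parts(token):
            terms |= ngrams(part.lower().strip())  # LowerCase, Trim, NGram
    return terms


def query_terms(value):
    """Approximate query-side terms (StandardTokenizer + LowerCase)."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", value)]


indexed = index_terms("DV-5PBRP")
for term in query_terms("DV-5PBRP"):
    print(term, term in indexed)
```

Under these assumptions the query side produces the two terms dv and 5pbrp, and only 5pbrp is among the indexed grams, because dv is shorter than minGramSize. If the real analysis chain behaves similarly, that might be relevant to the q.op=AND stage below, but I haven't confirmed it.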
Let me break this down into stages (I'm only going into enough detail to replicate the problem; the initial stages don't use edismax, although that is what we've chosen to use on our website):
1. q=DV\-5PBRP
   <- With this query I get 18 results, but not the one I'm looking for (this is most likely due to the default df searching on the productname field, which is fine).
2. q=mfgpartno_search:DV\-5PBRP
   <- This gives me the 1 result I'm looking for, but due to the query building I need to do on the website, it's better if I can use the q parameter as in stage 1.
3. q=DV\-5PBRP&defType=edismax&qf=mfgpartno_search
   <- This also gives me the 1 result I'm looking for, but again, for the website search qf needs to span more fields (the actual qf=productname_search shortdesc_search fulldesc_search mfgpartno_search productname shortdesc fulldesc keywords). To get more accurate results when searching that many fields, I implemented stage 4.
4. q=DV\-5PBRP&defType=edismax&qf=mfgpartno_search&q.op=AND
   <- With this test I get 0 results, though q.op=AND works great for most searches on our site.
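For anyone who wants to replay the four stages against their own index, the requests can be built like this. It is a minimal sketch: the base URL (host, port, and core name) is a placeholder I made up, and the helper name stage_urls is just for illustration; the parameters are exactly the ones listed in the stages above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace host, port, and core name with your own.
BASE_URL = "http://localhost:8983/solr/collection1/select"

# Request parameters for stages 1-4, as listed above.
STAGES = [
    {"q": r"DV\-5PBRP"},
    {"q": r"mfgpartno_search:DV\-5PBRP"},
    {"q": r"DV\-5PBRP", "defType": "edismax", "qf": "mfgpartno_search"},
    {"q": r"DV\-5PBRP", "defType": "edismax", "qf": "mfgpartno_search",
     "q.op": "AND"},
]


def stage_urls(base_url=BASE_URL):
    """Return one URL-encoded select URL per stage."""
    return [base_url + "?" + urlencode(params) for params in STAGES]


for number, url in enumerate(stage_urls(), start=1):
    print(number, url)
```

Appending debugQuery=true to any of these URLs should show how the query was parsed, which is useful for comparing stages 3 and 4.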
My big problem with search has been special characters like the dash, which sometimes must be treated literally and sometimes act as separators, as in product names or descriptions. Sometimes people will even replace the dash with a space when searching for a part number, and the search should still return relevant results.
I'm kind of stuck on how to get this special character search working - especially as it pertains to this mfgpartno_search field. How might I configure either the schema or query (or both) to get this working?