Sunspot Solr Search like Rails active record 'LIKE' search

Hi, I have been using the normal Rails ActiveRecord LIKE search in my app and have now started using Sunspot Solr search. I would like it to behave as closely to the Rails LIKE search as possible.


wine.rb
# sunspot stuff
searchable :auto_index => true, :auto_remove => true do
  text :name
end
# sunspot stuff


solr/conf/schema.xml
<fieldType name="text" class="solr.TextField" omitNorms="false">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>


application_controller.rb
search_string = "will input some values here"
query = "%" + search_string + "%"

solr_search = Wine.search do
  fulltext search_string
end
@solr_search_results = solr_search.results.sort_by{|e| e[:id]}

@rails_search_results = Wine.find(:all, :conditions => ['wines.name LIKE ?' , query]).sort_by{|e| e[:id]}


search1
search_string = "grand"

@solr_search_results
186 Grand Reserve
688 Grand Plaisir Cabernet Sauvignon Shiraz Malbec Petit Verdot Cabernet Franc
760 Grand Vin Blanc
768 Grand Rouge
857 Premier Grand Cru
1067 Grand Classique
1584 Grand Vin De Glenelly
3389 Grand Constance Muscat
3708 Grand Cuvèe Brut
3857 Grand Constance Muscat

@rails_search_results
186 Grand Reserve
688 Grand Plaisir Cabernet Sauvignon Shiraz Malbec Petit Verdot Cabernet Franc
760 Grand Vin Blanc
768 Grand Rouge
857 Premier Grand Cru
969 River Grandeur Cape Blend
972 River Grandeur Cabernet Sauvignon
973 River Grandeur Chardonnay
974 River Grandeur Chenin Blanc
975 River Grandeur Pinotage
976 River Grandeur Sauvignon Blanc
977 River Grandeur Shiraz
978 River Grandeur Rose
1067 Grand Classique
1584 Grand Vin De Glenelly
3389 Grand Constance Muscat
3708 Grand Cuvèe Brut
3857 Grand Constance Muscat

It seems Solr did not find the entries containing 'Grandeur'; it only found whole-word matches. How can I fix this?


search2
search_string = "rood"

@solr_search_results
200 Dassies Rood Cinsaut Cabernet Sauvignon Ruby Cabernet
3198 Dassies Rood Cinsaut Cabernet Sauvignon Ruby Cabernet
3394 Rood

@rails_search_results
200 Dassies Rood Cinsaut Cabernet Sauvignon Ruby Cabernet
483 Roodeberg Red Cabernet Sauvignon Shiraz Merlot
484 Roodeberg White Sauvignon Blanc Chardonnay Chenin Viognier
1113 Zevenrood
3044 Roodewal
3198 Dassies Rood Cinsaut Cabernet Sauvignon Ruby Cabernet
3394 Rood
3477 Roodeberg Red
3478 Roodeberg White
3594 Roodeberg White
3604 Roodeberg Red

The same thing happens here: Solr did not find 'Roodeberg' when search_string = 'rood'; it only found the whole-word matches.
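
For what it's worth, here is a rough pure-Ruby sketch of why the two searches can disagree, assuming the analyzer behaves roughly like "split into words and lowercase". The helpers `tokens`, `token_match?` and `like_match?` are made up for illustration and are not part of Sunspot or Solr. `like_match?` also assumes a case-insensitive LIKE, which depends on the database collation.

```ruby
# Rough stand-in for index-time analysis: lowercase the text and split it
# into runs of letters and digits. (Assumption: the real analyzer chain
# behaves similarly for plain wine names.)
def tokens(text)
  text.downcase.scan(/[[:alnum:]]+/)
end

# Whole-token matching: the query must equal one of the indexed tokens.
def token_match?(name, query)
  tokens(name).include?(query.downcase)
end

# Substring matching, like SQL "name LIKE '%query%'" with a
# case-insensitive collation (an assumption about the database).
def like_match?(name, query)
  name.downcase.include?(query.downcase)
end

names = ["Rood", "Roodeberg Red", "Zevenrood", "Dassies Rood Cinsaut"]

token_hits = names.select { |n| token_match?(n, "rood") }
like_hits  = names.select { |n| like_match?(n, "rood") }

puts token_hits.inspect
puts like_hits.inspect
```

Under these assumptions only the names that contain "rood" as a complete word pass the token check, while the substring check also accepts "Roodeberg Red" and "Zevenrood".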


Update

I added an NGram filter to schema.xml for partial matching, thanks to DanS,
but it is still not showing all results.

app/solr/conf/schema.xml

<fieldType name="text" class="solr.TextField" omitNorms="false">

    <analyzer type="index">
        <tokenizer class="solr.LowerCaseTokenizerFactory"/>
        <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="15"/>
    </analyzer>

    <analyzer type="query">
        <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    </analyzer>

</fieldType>



It still does not quite do what I want; have a look at the following example.

search3
search_string = "merl"

@solr_search_results
130 Merlot
202 Merlot
306 Merlot
336 Merlot
556 Merlot
579 Merlot
592 Merlot
623 Merlot
640 Merlot
689 Merlot
694 Merlot
714 Merlot
776 Merlot
790 Merlot
841 Merlot
865 Merlot
891 Merlot
947 Merlot
1015 Merlot
1045 Merlot
1046 Merlot
1073 Merlot
1075 Merlot
1089 Merlot
1096 Merlot
1111 Merlot
1121 Merlot
1144 Merlot
1145 Merlot
1169 Merlot

@rails_search_results
34 Cavalier Reserve Blend Merlot Cabernet Franc Cabernet Sauvignon Shiraz
129 Matt Black Cabernet Sauvignon Shiraz Merlot Petit Verdot Mourvedre Pinotage
130 Merlot
202 Merlot
240 Grappa Cabernet Merlot Premium
306 Merlot
336 Merlot
416 Dry Rosè Merlot
477 Orchestra Cabernet Sauvignon Malbec Merlot Cab Franc Shiraz
483 Roodeberg Red Cabernet Sauvignon Shiraz Merlot
556 Merlot
579 Merlot
592 Merlot
614 Cabernet Merlot
623 Merlot
640 Merlot
656 Calligraphy Merlot Cabernet Franc Sauvignon Blanc
672 Ondine Merlot
689 Merlot
694 Merlot
696 Barrel Select Merlot
714 Merlot
762 Private Collection Merlot
776 Merlot
790 Merlot
795 Private Collection Merlot
816 JJ Handmade Wines Merlot
832 Golden Triangle Merlot
841 Merlot
842 Merlot Rosé
854 Eagle Crest Cabernet Sauvignon Merlot
865 Merlot
877 Winemakers Choice Merlot Shiraz
891 Merlot
892 Merlot Reserve
893 Mountain Red Shiraz Merlot
941 Cellar Selection Merlot Cabernet Sauvignon
943 Vineyard Selection Cabertnet Sauvignon Merlot Cab Franc Shiraz
947 Merlot
982 Boet Erasmus Cabernet Sauvignon Merlot Malbec Petit Verdot
983 Cara Cabernet Sauvignon Shiraz Merlot
984 Classic Cabernet Sauvignon Shiraz Merlot
1010 Laureat Cabernet Sauvignon Merlot
1015 Merlot
1045 Merlot
1046 Merlot
1073 Merlot
1075 Merlot
1079 Cabernet/Merlot
1089 Merlot
1093 Adelberg Cabernet Sauvignon Merlot
1096 Merlot
1104 Z Collection Cabernet Franc Merlot Cabernet Sauvignon
1111 Merlot
1121 Merlot
1144 Merlot
1145 Merlot
1169 Merlot
1186 Merlot
1254 Cabernet Sauvignon Merlot
1260 Cabernet Sauvignon/Merlot
1261 Merlot
1269 Merlot
1326 Merlot
1349 Cabernet Sauvignon Merlot
1364 Cultivar Selection Merlot
1381 Merlot
1384 Cabernet Sauvignon Merlot
1393 Cabernet Sauvignon Merlot
1401 Cabernet Sauvignon Merlot
1404 Merlot
1421 Petit Cabernet Sauvignon Merlot
1424 Merlot
1431 Collection Merlot
1443 Merlot
1454 Merlot
1467 Poker Hill Shiraz Merlot
1468 Merlot
1476 Merlot
1491 Circumstance Merlot
1495 Peacock Ridge Merlot
1542 Merlot
1543 Merlot
1549 Merlot
1552 Merlot Reserve
1582 Unfiltered Merlot
1592 Merlot
3001 Merlot
3007 Cabernet Sauvignon Merlot
3036 Cabernet Sauvignon Merlot
3056 Merlot
3067 Kosher Merlot
3073 Organic Merlot
3079 Premium Merlot
3091 Merlot
3106 Merlot with a dash of Malbec
3133 Cabernet Sauvignon Merlot
3143 Five Climates Merlot
3154 Reserve No1 Merlot
3182 Lanoy Cabernet Sauvignon Merlot
3183 Reserve Collection Cab Sauv Merlot Cab Franc
3200 Merlot
3236 Giorgio Cabernet Sauvignon Merlot Petit Verdot Shiraz
3258 Danie De Wet Cabernet Sauvignon Merlot
3276 Red Cabernet Sauvignon Merlot Cab Franc Petit Verdot Shiraz
3288 Merlot
3303 Quartet Pinotage Cabernet Sauvignon Merlot Shiraz
3307 Diversity Merlot Malbec
3311 Vineyard Creations Merlot
3318 Caapmans Cabernet Sauvignon Merlot
3321 Luipaardsberg Merlot
3322 Merlot
3326 Rhinofields Merlot
3334 Merlot
3343 Merlot
3363 Merlot Cabernet Sauvignon
3372 Merlot
3390 Merlot
3416 R 62 Merlot Cabernet Sauvignon
3418 Unplugged 62 Merlot Rosé
3419 Unplugged 62 Merlot Shiraz
3431 Merlot
3439 KC Cabernet Sauvignon Merlot
3471 Orchestra Cabernet Sauvignon Malbec Merlot Cab Franc Shiraz
3497 Merlot
3498 Merlot Cabernet Sauvignon
3510 Merlot
3531 Merlot
3540 Merlot
3560 Merlot
3568 Merlot Rose
3578 Special Edition Merlot
3581 Merlot
3584 Cabernet Sauvignon Merlot
3624 Merlot
3642 Cellar Selecti on Merlot
3657 Merlot
3677 Merlot
3681 Merlot
3685 Series C Cabernet Sauvignon Merlot Cab Franc
3693 Merlot
3728 Alexanderfontein Merlot
3755 Peacock Ridge Merlot
3771 The Old Museum Merlot
3773 Cellar Selection Cabernet Sauvignon Merlot
3820 Merlot
3859 Merlot
3882 Dunstone Merlot
3900 Duckitt Merlot Cabernet Sauvignon
3919 Merlot
3947 Merlot

Dragster asked 15/5, 2012 at 13:33

Update2

Fixed! It seems Sunspot returns only 30 results per page by default, so it showed only the most relevant matches and skipped the others I also wanted.
I added this file:

myapp/config/initializers/sunspot_solr.rb

Sunspot.config.pagination.default_per_page = 3000
Dragster answered 21/5, 2012 at 11:52 Comment(2)
You should use pagination instead, because otherwise you will always be limited by that number. (Catholicon)
Model.search do paginate :page => params[:page], :per_page => 1 end, plus of course your fulltext or whatever else you use in there. Then you can call it with the page param, or use kaminari, for example. (Catholicon)
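
As a middle ground between one huge `default_per_page` and a single page, one possible approach is to walk through the pages and collect everything. The sketch below is pure Ruby so it runs without Solr. `collect_all_pages` is a hypothetical helper, not a Sunspot method, and the `fake_hits` array stands in for the search results. With Sunspot, the block would presumably wrap a search that uses `paginate :page => page, :per_page => per_page`.

```ruby
# Hypothetical helper (not part of Sunspot): requests page after page from
# the given block until a short or empty page signals the end, and returns
# everything that was collected.
def collect_all_pages(per_page)
  collected = []
  page = 1
  loop do
    batch = yield(page, per_page) # the caller fetches one page of results
    collected.concat(batch)
    break if batch.size < per_page
    page += 1
  end
  collected
end

# Stand-in data source so the sketch runs on its own. With Sunspot, the
# block body might instead look like (assumption, untested here):
#   Wine.search { fulltext q; paginate :page => page, :per_page => per_page }.results
fake_hits = (1..75).to_a

all_hits = collect_all_pages(30) do |page, per_page|
  fake_hits.each_slice(per_page).to_a[page - 1] || []
end

puts all_hits.size # prints 75
```

For a page shown to users, plain pagination with a page parameter, as suggested in the comments, is likely the better fit. Collecting every page mainly makes sense for exports or for comparisons like the ones in the question.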

Seems like you want to match substrings.

Add the following filter to your schema.xml to match any prefix of indexed text:

<filter class="solr.EdgeNGramFilterFactory"/>

Or the following to match arbitrary substrings:

<filter class="solr.NGramFilterFactory"/>

For more information, see the Solr wiki.
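
To illustrate the difference between the two filters, here is a small pure-Ruby sketch of the grams each one would roughly produce for a single lowercased token. `ngrams` and `edge_ngrams` are hypothetical helpers, not Solr APIs, and the gram sizes are passed in explicitly. The real filters read them from the `minGramSize` and `maxGramSize` attributes.

```ruby
# NGram-style: every substring of the token whose length lies between
# min and max, starting at any position.
def ngrams(token, min, max)
  grams = []
  (min..max).each do |size|
    # The range is empty when size exceeds the token length.
    (0..token.length - size).each do |start|
      grams << token[start, size]
    end
  end
  grams
end

# EdgeNGram-style (front side): only substrings that start at the first
# character of the token.
def edge_ngrams(token, min, max)
  (min..[max, token.length].min).map { |size| token[0, size] }
end

puts edge_ngrams("rood", 2, 15).inspect
puts ngrams("zevenrood", 2, 15).include?("rood")
puts edge_ngrams("zevenrood", 2, 15).include?("rood")
```

In this model a query term only matches when it is equal to one of the indexed grams, so a term shorter than the minimum size or longer than the maximum size cannot match. Front edge grams cover prefixes such as "rood" in "roodeberg", while a match in the middle or at the end of a word, such as "rood" in "zevenrood", needs the full n-gram variant.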

Preen answered 15/5, 2012 at 13:39 Comment(6)
It seems to work somewhat, but when I search for 'Dassies Ro' Solr returns no results, whereas Rails returns: 200 Dassies Rood Cinsaut Cabernet Sauvignon Ruby Cabernet, 201 Dassies Rose, 3198 Dassies Rood Cinsaut Cabernet Sauvignon Ruby Cabernet, 3199 Dassies Rosé. (Dragster)
Try adjusting the min and max gram size: <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>. (Preen)
I added NGram and EdgeNGram and it seems to work pretty well now, thanks! (Dragster)
You can remove EdgeNGram, I think. NGram is what you want, as it matches substrings in the middle and not just at the edges. (Preen)
When I do a search for 'merl' it only shows a couple of results. Do I increase the maxGramSize? (Dragster)
Do you have to reindex after adding new filters to schema.xml, or just restart Solr? (Anticipate)
