How do I do a partial field match using Haystack?
G

5

22

I needed a simple search tool for my Django-powered website, so I went with Haystack and Solr. I have set everything up correctly and get the right results when I type in an exact word or phrase, but I get no results when I type in only part of a word.

For example: "John" returns "John Doe" but "Joh" doesn't return anything.

Model:

from django.db import models

class Person(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)

Search Index:

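from haystack import site
from haystack.indexes import SearchIndex, CharField

class PersonIndex(SearchIndex):
    # Main document field, rendered from a template
    text = CharField(document=True, use_template=True)
    first_name = CharField(model_attr='first_name')
    last_name = CharField(model_attr='last_name')

site.register(Person, PersonIndex)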

I'm guessing there's some setting I'm missing that enables partial field matching. I've seen people talking about EdgeNGramFilterFactory() in some forums, and I've Googled it, but I'm not quite sure of its implementation. Plus, I was hoping there was a haystack-specific way of doing it in case I ever switch out the search backend.

Genniegennifer answered 8/12, 2010 at 19:51 Comment(0)
C
16

You can achieve that behavior by making your index's text field an EdgeNgramField:

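from haystack.indexes import SearchIndex, CharField, EdgeNgramField

class PersonIndex(SearchIndex):
    # EdgeNgramField indexes the leading fragments of each word, so "Joh" can match "John"
    text = EdgeNgramField(document=True, use_template=True)
    first_name = CharField(model_attr='first_name')
    last_name = CharField(model_attr='last_name')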
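To see why this helps, here is a rough pure-Python sketch of what edge n-gramming does to a single word at index time. It is only an illustration, not Haystack or Solr code: the helper name edge_ngrams and the sizes 2 and 15 are assumptions, and the real sizes depend on your backend's schema.

```python
def edge_ngrams(token, min_size=2, max_size=15):
    """Return the prefixes of `token` with lengths from min_size to max_size.

    Hypothetical helper for illustration; the sizes are assumed defaults.
    """
    longest = min(len(token), max_size)
    return [token[:n] for n in range(min_size, longest + 1)]


# Each prefix is stored as its own searchable term.
indexed_terms = edge_ngrams("john")
print(indexed_terms)            # the prefixes "jo", "joh", "john"
print("joh" in indexed_terms)   # a query for "joh" now has a term to hit
```

Because the prefixes are generated when documents are indexed, existing documents only gain them after the index has been rebuilt.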
Carew answered 18/4, 2013 at 12:33 Comment(2)
I'm using Elasticsearch and Haystack, and this does the trick for partial matching perfectly, saving me some hours of Elasticsearch configuration.Electroscope
@Liarez how did you get this to work? I'm using haystack/elastic search and I wasn't able to get it to work.Alternant
A
3

In addition to the EdgeNgramField hint that others mentioned on this page (and of course NgramField, if you work with Asian languages), I think it is worth mentioning that in django-haystack you can run raw queries against Solr like this:

from haystack.query import SearchQuerySet
from haystack.inputs import Raw
SearchQuerySet().filter(text=Raw(query))

where text is the field you want to search, and query can be anything that follows Lucene's Query Parser Syntax (version 3.6 or 4.6).

This way you can easily set the query to ABC* (wildcard), ABC~ (fuzzy), or anything else that fits the syntax.
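If the search term comes from user input, it is probably worth escaping Lucene's special characters before appending the wildcard. Below is a minimal sketch under that assumption; prefix_query is a hypothetical helper, not part of Haystack, and it only handles a single term without spaces.

```python
import re

# Characters with special meaning in Lucene's query parser syntax.
_LUCENE_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def prefix_query(term):
    """Build a Lucene prefix query such as 'Joh*' from a single raw term.

    Hypothetical helper for illustration only.
    """
    term = term.strip()
    if not term:
        raise ValueError("term must not be empty")
    # Prefix every special character with a backslash, then add the wildcard.
    return _LUCENE_SPECIAL.sub(r'\\\1', term) + "*"


print(prefix_query("Joh"))

# With Haystack installed, the result would be passed to Raw, e.g.:
# SearchQuerySet().filter(text=Raw(prefix_query("Joh")))
```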

Arvy answered 22/5, 2013 at 19:2 Comment(0)
O
1

I had a similar issue while searching for non-English words, for instance:

ABC
ABCD

If I search for the keyword ABC, I expect both of the results above. I was able to achieve this by converting the keyword to lowercase and using startswith:

keywords = 'ABC'
results.filter(code__startswith=keywords.lower())
Overhaul answered 31/3, 2011 at 20:43 Comment(1)
Of course it won't, the case I illustrated was to search for prefixes only.Overhaul
V
1

I had the same problem, and the only way to get the results I wanted was to modify the Solr configuration file to include n-gram filtering, because the default tokenizer splits on whitespace. So use NGramTokenizer instead. I'd love to know if there is a Haystack way of doing the same thing.

I'm not at my machine right now, but this should do the trick:

<tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="15" />
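For intuition, here is a rough pure-Python sketch of the fragments an n-gram tokenizer with those sizes would produce for one word. The helper ngrams is hypothetical, and the order in which Solr actually emits grams may differ; only the set of fragments matters here.

```python
def ngrams(text, min_size=3, max_size=15):
    """Return every substring of `text` with length min_size..max_size.

    Hypothetical helper for illustration; not Solr's implementation.
    """
    grams = []
    for start in range(len(text)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


terms = ngrams("john")
print(terms)
print("joh" in terms)   # three-letter fragments are indexed
print("jo" in terms)    # shorter than minGramSize, so not indexed
```

Note that with minGramSize="3", queries shorter than three characters have nothing to match against, and that unlike edge n-grams this also indexes fragments from the middle of a word, such as "ohn".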
Vullo answered 14/6, 2011 at 22:1 Comment(0)
K
0

@riz I can't comment yet or I would, and I know it's an old comment, but in case anyone else runs into this: make sure to run manage.py update_index after changing the field.

"@Liarez how did you get this to work? I'm using haystack/elastic search and I wasn't able to get it to work."

Kuehl answered 21/5, 2015 at 2:44 Comment(1)
Update didn't work for me but rebuild_index did. Watch out if your index is big!Pipes
