How to use a tsvector field to perform ranking in Django with PostgreSQL full-text search?

I need to run a ranking query using PostgreSQL's full-text search from Django, with the django.contrib.postgres module.

According to the documentation, this is straightforward with the SearchRank class:

>>> from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
>>> vector = SearchVector('body_text')
>>> query = SearchQuery('cheese')
>>> Entry.objects.annotate(rank=SearchRank(vector, query)).order_by('-rank')

This probably works well, but it is not quite what I want: my table already has a column containing tsvector data, and I would like to rank against that column instead of recomputing the tsvector on every search query.

Unfortunately, I can't figure out how to pass this tsvector field to the SearchRank class in place of a SearchVector built on a raw text field.

Does anyone know how to do this?

Edit: Of course, simply instantiating a SearchVector from the tsvector field does not work. It fails with this error (approximate wording, since I translated it from French):

django.db.utils.ProgrammingError: ERROR: function to_tsvector(tsvector) does not exist

Finite answered 6/1, 2017 at 14:32 Comment(0)

If your model has a SearchVectorField like so:

from django.contrib.postgres.search import SearchVectorField
from django.db import models

class Entry(models.Model):
    ...
    search_vector = SearchVectorField()

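you can pass the column to SearchRank with an F expression:

from django.contrib.postgres.search import SearchQuery, SearchRank
from django.db.models import F

...
query = SearchQuery('cheese')
# Rank against the stored tsvector column instead of recomputing it.
Entry.objects.annotate(
    rank=SearchRank(F('search_vector'), query)
).order_by('-rank')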
Bourn answered 18/1, 2017 at 16:53 Comment(3)
I did not know about query expressions... Thanks for the answer, it works perfectly. - Finite
Is it possible to add a weight to the vector? docs.djangoproject.com/en/2.2/ref/contrib/postgres/search/… - Witling
When you update the field, add the weights there: SearchVector("your_field", weight='A', config='english') + SearchVector("your_other_field", weight='B', config='english') - Alinealinna
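
The weighting suggested in the last comment could be applied whenever the stored column is refreshed. Here is a minimal sketch, assuming the model has the `search_vector` field from the answer plus two text columns; `title` is a hypothetical name, `body_text` is taken from the question, and the helper `refresh_search_vector` is made up for illustration:

```python
def refresh_search_vector(queryset, config="english"):
    """Recompute the stored tsvector column for every row in `queryset`.

    Assumes a SearchVectorField named `search_vector` and text columns
    `title` (hypothetical) and `body_text` on the model.
    """
    # Imported inside the function so this module can be loaded even
    # where Django or the PostgreSQL driver is not installed.
    from django.contrib.postgres.search import SearchVector

    # Weight 'A' ranks higher than 'B' when ts_rank scores a match.
    vector = (
        SearchVector("title", weight="A", config=config)
        + SearchVector("body_text", weight="B", config=config)
    )
    # Runs a single UPDATE and returns the number of rows matched.
    return queryset.update(search_vector=vector)
```

It would be called as `refresh_search_vector(Entry.objects.all())`. Because the weights are baked into the stored tsvector, ranking with `F('search_vector')` picks them up without any change to the search query.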

I've seen mixed answers here on SO and in the official documentation. The documentation does not use F expressions for this, but that may simply be because it gives no example of SearchRank with a SearchVectorField.

Looking at the output of .explain(analyze=True):

Without the F Expression:

Sort Key: (ts_rank(to_tsvector(COALESCE((search_vector)::text, ''::text)) 

When the F Expression is used:

Sort Key: (ts_rank(search_vector, ...) 

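In my experience, the only practical difference between passing F('search_vector') and passing the field name as a string is that the F expression returns much faster but is sometimes less accurate, depending on how you structure the query. The explain output shows where the speed comes from: given the field name as a string, the stored column is cast to text and run through to_tsvector again, whereas with the F expression ts_rank reads the stored column directly. Forcing the COALESCE form can still be useful in some cases. In my case, using the F expression with my SearchVectorField gives about a 3-5x speedup.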

Passing a config kwarg to your SearchQuery (for example config='english') also improves things dramatically.
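
Putting the two points together (the F expression and an explicit config), a rough sketch might look like this. The helper name `search_ranked` is hypothetical, and it assumes the model has a SearchVectorField called `search_vector`:

```python
def search_ranked(queryset, term, config="english"):
    """Annotate `queryset` with a ts_rank score and sort best-first.

    Assumes the model has a SearchVectorField named `search_vector`.
    """
    # Imported inside the function so this module can be loaded even
    # where Django or the PostgreSQL driver is not installed.
    from django.contrib.postgres.search import SearchQuery, SearchRank
    from django.db.models import F

    # Explicit config, so the query is parsed with a known text search
    # configuration rather than the database default.
    query = SearchQuery(term, config=config)
    # F() hands the stored tsvector straight to ts_rank.
    return queryset.annotate(
        rank=SearchRank(F("search_vector"), query)
    ).order_by("-rank")
```

To check the plan on your own data, something like `print(search_ranked(Entry.objects.all(), "cheese").explain(analyze=True))` should show the `ts_rank(search_vector, ...)` sort key rather than the `to_tsvector(COALESCE(...))` form.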

Alinealinna answered 25/12, 2021 at 23:53 Comment(0)
