Why is postgres trigram word_similarity function not using a gin index?
Asked Answered
B

1

4

The postgres trigram documentation states:

The pg_trgm module provides GiST and GIN index operator classes that allow you to create an index over a text column for the purpose of very fast similarity searches. These index types support the above-described similarity operators, and additionally support trigram-based index searches for LIKE, ILIKE, ~ and ~* queries.

and shows the following example:

SELECT t, word_similarity('word', t) AS sml
  FROM test_trgm
  WHERE 'word' <% t
  ORDER BY sml DESC, t;

Awesome!

However, when running the following query:

SELECT * 
FROM place 
WHERE word_similarity(place.name, '__SOME_STRING__') > 0.5

The index that was created is not being used.

However, when using ILIKE or the %> operators, it does seem that the index is being used. Why is the index not used on the word_similarity function?

Baobaobab answered 11/2, 2020 at 21:12 Comment(0)
B
6

According to this postgres forum response

PostgreSQL doesn't use index scan with functions within WHERE clause. So you always need to use operators instead. You can try <% operator and pg_trgm.word_similarity_threshold variable:

=# SET pg_trgm.word_similarity_threshold TO 0.1;

=# SELECT name, popularity FROM temp.items3_v ,(values ('some phrase'::text)) consts(input) WHERE input <% name ORDER BY 2, input <<-> name;

So, the query can be updated to use the index as follows:

SET pg_trgm.word_similarity_threshold TO 0.1;
SELECT * 
FROM place 
WHERE place.name <<-> '__SOME_STRING__';

Warning: the operator only uses the index with only one version of the commutator pair. I.e., it only used the index in the case <<-> and not the case <->>. This stack overflow q/a post looks like it gives a reasonable explanation as to why:

These are different operations, and only one of them is supported by the index.

Baobaobab answered 11/2, 2020 at 21:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.