PostgreSQL LIKE query performance variations
Asked Answered
T

8

160

I have been seeing quite a large variation in response times regarding LIKE queries to a particular table in my database. Sometimes I will get results within 200-400 ms (very acceptable) but other times it might take as much as 30 seconds to return results.

I understand that LIKE queries are very resource intensive but I just don't understand why there would be such a large difference in response times. I have built a btree index on the owner1 field but I don't think it helps with LIKE queries. Anyone have any ideas?

Sample SQL:

SELECT gid, owner1 FORM parcels
WHERE owner1 ILIKE '%someones name%' LIMIT 10

I've also tried:

SELECT gid, owner1 FROM parcels
WHERE lower(owner1) LIKE lower('%someones name%') LIMIT 10

And:

SELECT gid, owner1 FROM parcels
WHERE lower(owner1) LIKE lower('someones name%') LIMIT 10

With similar results.
Table Row Count: about 95,000.

Terle answered 14/10, 2009 at 14:37 Comment(0)
R
390

FTS does not support LIKE

The previously accepted answer was incorrect. Full Text Search with its full text indexes is not for the LIKE operator at all, it has its own operators and doesn't work for arbitrary strings. It operates on words based on dictionaries and stemming. It does support prefix matching for words, but not with the LIKE operator:

Trigram index for LIKE

Install the additional module pg_trgm which provides operator classes for GIN and GiST trigram indexes to support all LIKE and ILIKE patterns, not just left-anchored ones:

Example index:

CREATE INDEX tbl_col_gin_trgm_idx  ON tbl USING gin  (col gin_trgm_ops);

Or:

CREATE INDEX tbl_col_gist_trgm_idx ON tbl USING gist (col gist_trgm_ops);

Example query:

SELECT * FROM tbl WHERE col LIKE 'foo%';
SELECT * FROM tbl WHERE col LIKE '%foo%';   -- works with leading wildcard, too
SELECT * FROM tbl WHERE col ILIKE '%foo%';  -- works case insensitively as well

Trigrams? What about shorter strings?

Words with less than 3 letters in indexed values still work. The manual:

Each word is considered to have two spaces prefixed and one space suffixed when determining the set of trigrams contained in the string.

And search patterns with less than 3 letters? The manual:

For both LIKE and regular-expression searches, keep in mind that a pattern with no extractable trigrams will degenerate to a full-index scan.

Meaning, that index / bitmap index scans still work (query plans for prepared statement won't break), it just won't buy you better performance. Typically no big loss, since 1- or 2-letter strings are hardly selective (more than a few percent of the underlying table matches) and index support would not improve performance (much) to begin with, because a full table scan is faster.

Prefix matching

Search patterns with no leading wildcard: col LIKE 'foo%'.

^@ operator / starts_with() function

Quoting the release notes of Postgres 11:

Add prefix-match operator text ^@ text, which is supported by SP-GiST (Ildus Kurbangaliev)

This is similar to using var LIKE 'word%' with a btree index, but it is more efficient.

Example query:

SELECT * FROM tbl WHERE col ^@ 'foo';  -- no added wildcard

But the potential of operator and function stays limited until planner support is improved in Postgres 15 and the ^@ operator is documented properly. The release notes:

Allow the ^@ starts-with operator and the starts_with() function to use btree indexes if using the C collation (Tom Lane)

Previously these could only use SP-GiST indexes.

COLLATE "C"

Since Postgres 9.1, an index with COLLATE "C" provides the same functionality as the operator class text_pattern_ops described below. See:

Example index:

CREATE INDEX tbl_col_text_collate_c_idx ON tbl(col COLLATE "C");

text_pattern_ops (original answer)

For just left-anchored patterns (no leading wildcard) you get the optimum with a suitable operator class for a btree index: text_pattern_ops or varchar_pattern_ops. Both built-in features of standard Postgres, no additional module needed. Similar performance, but much smaller index.

Example index:

CREATE INDEX tbl_col_text_pattern_ops_idx ON tbl(col text_pattern_ops);

Example query:

SELECT * FROM tbl WHERE col LIKE 'foo%';  -- no leading wildcard

Or, if you should be running your database with the 'C' locale (effectively no locale), then everything is sorted according to byte order anyway and a plain btree index with default operator class does the job.

Further reading

Riffraff answered 19/11, 2012 at 11:8 Comment(10)
With no leading wildcard on a table of 500K lines, gin index with gin_trgm_ops appears being 10 times faster than btreeBorneol
@nicolas: The comparison depends on many variables. Key length, data distribution, pattern length, possible index only scan ... And most importantly: Postgres version. GIN indexes have been improved substantially in pg 9.4 and 9.5. An the new version of pg_trgm (to be released with pg 9.6) is going to bring more improvements.Riffraff
If I got the docs right, with pg_trgm you need query string of at least 3 characters in length, for example fo% would not hit index but do a scan instead. Something to note.Digenesis
@TuukkaMustonen: Good point. Well, (bitmap) index scans still work, they just won't buy you better performance. I added some clarification above.Riffraff
Could the ^@ operator and the starts_with() string function get the same advantage as the left-anchored LIKE?Theocentric
@КонстантинВан: They actually do with Postgres 15 (currently RC). Consider updates to my answer.Riffraff
and for suffix in (LIKE '%blabla';) what is the best index to use ?Billionaire
@Billionaire A trigram index also covers that. Or, more efficiently for just this case, an expression index on the reversed string column. See: https://mcmap.net/q/28751/-search-with-additional-words-at-the-beginning or https://mcmap.net/q/28752/-full-text-search-for-compound-words. Or start a new question with your specifics.Riffraff
Ok, I see. I wonder if using LOWER before REVERSE will help improve query performance ? And how ?Billionaire
Thanks for your answer. I've posted a related question: https://mcmap.net/q/28753/-how-to-index-a-column-for-leading-wildcard-search-and-check-progress/470749Greensand
S
11

Possibly the fast ones are anchored patterns with case-sensitive like that can use indexes. i.e. there is no wild card at the beginning of the match string so the executor can use an index range scan. (the relevant comment in the docs is here) Lower and ilike will also lose your ability to use the index unless you specifically create an index for that purpose (see functional indexes).

If you want to search for string in the middle of the field, you should look into full text or trigram indexes. First of them is in Postgres core, the other is available in the contrib modules.

Samsun answered 14/10, 2009 at 14:43 Comment(1)
I hadn't thought about creating an index on the lowercase value of the field. That way I can convert the query text to lowercase on the backend before querying.Terle
P
7

I recently had a similar issue with a table containing 200000 records and I need to do repeated LIKE queries. In my case, the string being search was fixed. Other fields varied. Because that, I was able to rewrite:

SELECT owner1 FROM parcels
WHERE lower(owner1) LIKE lower('%someones name%');

as

CREATE INDEX ix_parcels ON parcels(position(lower('someones name') in lower(owner1)));

SELECT owner1 FROM parcels
WHERE position(lower('someones name') in lower(owner1)) > 0;

I was delighted when the queries came back fast and verified the index is being used with EXPLAIN ANALYZE:

 Bitmap Heap Scan on parcels  (cost=7.66..25.59 rows=453 width=32) (actual time=0.006..0.006 rows=0 loops=1)
   Recheck Cond: ("position"(lower(owner1), 'someones name'::text) > 0)
   ->  Bitmap Index Scan on ix_parcels  (cost=0.00..7.55 rows=453 width=0) (actual time=0.004..0.004 rows=0 loops=1)
         Index Cond: ("position"(lower(owner1), 'someones name'::text) > 0)
 Planning time: 0.075 ms
 Execution time: 0.025 ms
Prisilla answered 21/11, 2018 at 23:10 Comment(0)
C
4

You could install Wildspeed, a different type of index in PostgreSQL. Wildspeed does work with %word% wildcards, no problem. The downside is the size of the index, this can be large, very large.

Cantilever answered 25/2, 2010 at 21:36 Comment(0)
B
4

When ever you use a clause on a column with functions eg LIKE, ILIKE, upper, lower etc. Then postgres wont take your normal index into consideration. It will do a full scan of the table going through each row and therefore it will be slow.

The correct way would be to create a new index according to your query. For example if i want to match a column without case sensitivity and my column is a varchar. Then you can do it like this.

create index ix_tblname_col_upper on tblname (UPPER(col) varchar_pattern_ops);

Similarly if your column is a text then you do something like this

create index ix_tblname_col_upper on tblname (UPPER(col) text_pattern_ops);

Similarly you can change the function upper to any other function that you want.

Bleat answered 18/9, 2019 at 13:56 Comment(0)
C
2

Please Execute below mentioned query for improve the LIKE query performance in postgresql. create an index like this for bigger tables:

CREATE INDEX <indexname> ON <tablename> USING btree (<fieldname> text_pattern_ops)
Concoction answered 11/10, 2017 at 11:40 Comment(1)
This only works if the pattern doesn't start with a wildcard - in this case the first two sample queries both start with a wildcard.Asyllabic
Q
1

Your like queries probably cannot use the indexes you created because:

1) your LIKE criteria begins with a wildcard.

2) you've used a function with your LIKE criteria.

Quietude answered 14/10, 2009 at 14:45 Comment(0)
D
1

for what it's worth, Django ORM tends to use UPPER(text) for all LIKE queries to make it case insensitive,

Adding an index on UPPER(column::text) has greatly sped up my system, unlike any other thing.

As far as leading %, yes that will not use an index. See this blog for a great explanation:

https://use-the-index-luke.com/sql/where-clause/searching-for-ranges/like-performance-tuning

Dateless answered 20/9, 2018 at 14:55 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.