Practical limitations of expression indexes in PostgreSQL
Asked Answered
A

1

5

I have a need to store data using the HSTORE type and index by key.

CREATE INDEX ix_product_size ON product(((data->'Size')::INT))
CREATE INDEX ix_product_color ON product(((data->'Color')))
etc.

What are the practical limitations of using expression indexes? In my case, there could be several hundred different types of data, hence several hundred expression indexes. Every insert, update, and select query will have to process against these indexes in order to pick the correct one.

Antons answered 7/6, 2011 at 11:31 Comment(1)
I am using the latest version of PG.Antons
P
4

I've never played with hstore, but I do something similar when I need an EAV column, e.g.:

create index on product_eav (eav_value) where (eav_type = 'int');

The limitation in doing so is that you need to be explicit in your query to make use of it, i.e. this query would not make use of the above index:

select product_id
from product_eav
where eav_name = 'size'
and eav_value = :size;

But this one would:

select product_id
from product_eav
where eav_name = 'size'
and eav_value = :size
and type = 'int';

In your example it should likely be more like:

create index on product ((data->'size')::int) where (data->'size' is not null);

This should avoid adding a reference to the index when there is no size entry. Depending on the PG version you're using the query may need to be modified like so:

select product_id
from products
where data->'size' is not null
and data->'size' = :size;

Another big difference between regular and partial index is that the latter cannot enforce a unique constraint in a table definition. This will succeed:

create unique index foo_bar_key on foo (bar) where (cond);

The following won't:

alter table foo add constraint foo_bar_key unique (bar) where (cond);

But this will:

alter table foo add constraint foo_bar_excl exclude (bar with =) where (cond);
Prospero answered 7/6, 2011 at 12:32 Comment(10)
Thanks. I'm not clear why you say "(data->'size' is not null)" is needed. From PostgreSQL's site: :A partial index is an index built over a subset of a table; the subset is defined by a conditional expression (called the predicate of the partial index). The index contains entries for only those table rows that satisfy the predicate." So surely if I don't add a particular HSTORE key, there won't be an index entry.Antons
Within the index, without a where clause, the index would add an entry for each irrespective of whether the size is defined or not. A partial index (with a where clause) fixes this in that it only adds an entry for rows that meet the condition. This makes the index much smaller, since it only contains entries for the relevant rows.Prospero
Ah, you are merely ensuring it doesn't add an index entry for cases where data is defined with a null value for a key. Ok, I wasn't thinking that would happen, but I get your point.Antons
Hehe, it does happen and that's one among the many cases where partial indexes are useful. :-)Prospero
Correction: I mean an expression index.Antons
In that case no limitations whatsoever... Just call/use your expression and enjoy the index.Prospero
Actually, there is one that you need to know: functions involved in the index must be immutable.Prospero
So I'm not going to experience any notable performance degradation even in the face of 100+ indexes?Antons
100+ indexes will introduce a performance degradation irrespective of what's in them. Keep in mind that, for all intents and purposes, each index is a table in its own right. If you add/update a row in your table, you'll be adding/updating another 100 other rows in the index "tables".Prospero
@Dennis true, but only if the data matches these expression indexes. Thanks for your help. I believe the only real answer will come from some extensive testing.Antons

© 2022 - 2024 — McMap. All rights reserved.