In PostgreSQL 9.3, there are multiple ways to build an expression, which points to a json field's nested property:
data->'foo'->>'bar'
data#>>'{foo,bar}'
json_extract_path_text(data, 'foo', 'bar')
Therefore PostgreSQL only use these indexes, if the query's expression is an exact match with the index's expression.
CREATE TABLE json_test_index1(data json);
CREATE TABLE json_test_index2(data json);
CREATE TABLE json_test_index3(data json);
CREATE INDEX ON json_test_index1((data->'foo'->>'bar'));
CREATE INDEX ON json_test_index2((data#>>'{foo,bar}'));
CREATE INDEX ON json_test_index3((json_extract_path_text(data, 'foo', 'bar')));
-- these queries use an index, while all other combinations not:
EXPLAIN SELECT * FROM json_test_index1 WHERE data->'foo'->>'bar' = 'baz';
EXPLAIN SELECT * FROM json_test_index2 WHERE data#>>'{foo,bar}' = 'baz';
EXPLAIN SELECT * FROM json_test_index3 WHERE json_extract_path_text(data, 'foo', 'bar') = 'baz';
My questions are:
Is this behaviour intended? I thought the query optimizer should (at least) use the index with the #>>
operator, when the query contains the appropriate call of json_extract_path_text()
-- and vice versa.
If I want to use more of these expressions in my application (not just one, f.ex. stick to the ->
& ->>
operators), what indexes should I build? (I hope, not all of them.)
Are there any chance, that some future Postgres versions' optimizers will understand the equivalence of these expressions?
EDIT:
When i create an additional operator for that:
CREATE OPERATOR ===> (
PROCEDURE = json_extract_path_text,
LEFTARG = json,
RIGHTARG = text[]
);
This query (table from the previous example) still not uses its index:
EXPLAIN SELECT * FROM json_test_index3 WHERE data ===> '{foo,bar}' = 'baz';
Bonus question:
While Postgres expands the operators into function calls (behind the scenes), why this still not using its index?
#>>
operator should be able to make use of the index (similar to the current capabilities when indexing thehstore
data type). If you have "flat" key/value pairs that you need to index, currently your only option ishstore
(or wait for 9.4) – Bisterjsonb
fields will give a much more flexible index, it will come with a cost of performance. I need to query only a few properties inside the json object and specific properties' own indexes should be faster. I just wonder, if there are multiple ways to get a property, why i need to build an index for all of them, to make use of one (in every case). Or am i missing something? – Defroster