Postgresql COALESCE performance problem
Asked Answered
C

2

12

I have this table in Postgresql:

CREATE TABLE my_table
(
    id bigint NOT NULL,
    value bigint,
    CONSTRAINT my_table_pkey PRIMARY KEY (id)
);

There are ~50000 rows in my_table.

The question is, why the query:

SELECT * FROM my_table WHERE id = COALESCE(null, id) and value = ?

is slower than this one:

SELECT * FROM my_table WHERE value = ?

Is there any solution, other than optimizing the query string in app-layer?

EDIT: Practically, the question is how to rewrite the query select * from my_table where id=coalesce(?, id) and value=? to have worst case performance not less than that of select * from my_table where value=? in Postgresql 9.0

Cioffi answered 21/6, 2011 at 13:53 Comment(6)
Your query is rather odd. What is the purpose of id = COALESCE(null, id)? The COALESCE will always return id because id is defined as NOT NULL and id = id will always return true too.Cornhusking
@steve select .... id=id ... is also slow. Any idea why?Cioffi
@tair I think the optimizer includes sleeps to puzzle you because you feed it annoying/useless trick queries.Linalinacre
My guess is there is an index on value and not on id. Just a guess though.Cornhusking
@Linalinacre the actual statement was select * from my_table where id=coalesce(?, id) and value=?, so while being fairly annoying, it not completely useless :)Cioffi
@tair that makes a lot more sense....Linalinacre
P
6

Try rewriting the query of the form

SELECT *
  FROM my_table
 WHERE value = ?
   AND (? IS NULL OR id = ?)

From my own quick tests

INSERT INTO my_table select generate_series(1,50000),1;
UPDATE my_table SET value = id%17;

CREATE INDEX val_idx ON my_table(value);

VACUUM ANALYZE my_table;

\set idval 17
\set pval   0

explain analyze 
SELECT *
  FROM my_table
 WHERE value = :pval
   AND (:idval IS NULL OR id = :idval);

Index Scan using my_table_pkey on my_table  (cost=0.00..8.29 rows=1 width=16) (actual time=0.034..0.035 rows=1 loops=1)
   Index Cond: (id = 17)
   Filter: (value = 0)
 Total runtime: 0.064 ms

\set idval null

explain analyze 
SELECT *
  FROM my_table
 WHERE value = :pval
   AND (:idval IS NULL OR id = :idval);

Bitmap Heap Scan on my_table  (cost=58.59..635.62 rows=2882 width=16) (actual time=0.373..1.594 rows=2941 loops=1)
   Recheck Cond: (value = 0)
   ->  Bitmap Index Scan on validx  (cost=0.00..57.87 rows=2882 width=0) (actual time=0.324..0.324 rows=2941 loops=1)
         Index Cond: (value = 0)
 Total runtime: 1.811 ms
Periwinkle answered 23/6, 2011 at 8:43 Comment(0)
C
5

From creating a similar table, populating it, updating statistics, and finally looking at the output of EXPLAIN ANALYZE, the only difference I see is that the first query filters like this:

Filter: ((id = COALESCE(id)) AND (value = 3))

and the second one filters like this:

Filter: (value = 3)

I see substantially different performance and execution plans when there's an index on the column "value". In the first case

Bitmap Heap Scan on my_table  (cost=19.52..552.60 rows=5 width=16) (actual time=19.311..20.679 rows=1000 loops=1)
  Recheck Cond: (value = 3)
  Filter: (id = COALESCE(id))
  ->  Bitmap Index Scan on t2  (cost=0.00..19.52 rows=968 width=0) (actual time=19.260..19.260 rows=1000 loops=1)
        Index Cond: (value = 3)
Total runtime: 22.138 ms

and in the second

Bitmap Heap Scan on my_table  (cost=19.76..550.42 rows=968 width=16) (actual time=0.302..1.293 rows=1000 loops=1)
  Recheck Cond: (value = 3)
  ->  Bitmap Index Scan on t2  (cost=0.00..19.52 rows=968 width=0) (actual time=0.276..0.276 rows=1000 loops=1)
        Index Cond: (value = 3)
Total runtime: 2.174 ms

So I'd say it's slower because the db engine a) evaluates the COALESCE() expression rather than optimizing it away, and b) evaluating it involves an additional filter condition.

Clisthenes answered 21/6, 2011 at 16:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.