Consider the following table :
foo | bar
-----+-----
3 | 1
8 | 1
2 | 1
8 | 5
6 | 5
5 | 5
4 | 5
5 | 7
4 | 7
Column foo
contains whatever. Column bar
is almost ordered and rows of common bar
value follow each other. Table contains ~1.7 million rows in total and about 15 rows for each distinct bar
value.
I find PARTITION BY
quite slow and am wondering if I can do anything to improve its performance ?
I tried to CREATE INDEX bar_idx ON foobar(bar)
but it had no effect on performance (IRL there is already a primary key on another column of the table). I'm using PostgreSQL 9.3.5.
Here are the EXPLAIN ANALYZE
for a simple query with and without PARTITION BY
:
> EXPLAIN ANALYZE SELECT count(foo) OVER (PARTITION BY bar) FROM foobar;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=262947.92..293133.35 rows=1724882 width=8) (actual time=2286.082..3504.372 rows=1724882 loops=1)
-> Sort (cost=262947.92..267260.12 rows=1724882 width=8) (actual time=2286.068..2746.047 rows=1724882 loops=1)
Sort Key: bar
Sort Method: external merge Disk: 27176kB
-> Seq Scan on foobar (cost=0.00..37100.82 rows=1724882 width=8) (actual time=0.019..441.827 rows=1724882 loops=1)
Total runtime: 3606.695 ms
(6 lignes)
> EXPLAIN ANALYZE SELECT foo FROM foobar;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Seq Scan on foobar (cost=0.00..37100.82 rows=1724882 width=4) (actual time=0.014..385.931 rows=1724882 loops=1)
Total runtime: 458.776 ms
(2 lignes)
First improvement, increasing work_mem :
In most cases increasing work_mem
, as suggested by hbn, should help. In my case I'm working on SSD, so switching to RAM (increasing work_mem
to 1 GB) only reduces the processing time by 1.5 :
> EXPLAIN (ANALYZE, BUFFERS) SELECT foo OVER (PARTITION BY bar) FROM foobar;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=215781.92..245967.35 rows=1724882 width=8) (actual time=933.575..1931.656 rows=1724882 loops=1)
Buffers: shared hit=2754 read=17098
-> Sort (cost=215781.92..220094.12 rows=1724882 width=8) (actual time=933.558..1205.314 rows=1724882 loops=1)
Sort Key: bar
Sort Method: quicksort Memory: 130006kB
Buffers: shared hit=2754 read=17098
-> Seq Scan on foobar (cost=0.00..37100.82 rows=1724882 width=8) (actual time=0.023..392.446 rows=1724882 loops=1)
Buffers: shared hit=2754 read=17098
Total runtime: 2051.494 ms
(9 lignes)
Second improvement, using CLUSTER
:
I tried a few of the suggestions of this post — increasing statistics had no significant effect in my case. The only one which helped or was not already active was to "rewrite the table in the physical order of the index", using CLUSTER
(you may prefer pg_repack
, read original post) :
> CLUSTER foobar USING bar_idx;
CLUSTER
> EXPLAIN (ANALYZE, BUFFERS) SELECT count(foo) OVER (PARTITION BY bar) FROM foobar;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.43..150079.25 rows=1724882 width=8) (actual time=0.031..1372.416 rows=1724882 loops=1)
Buffers: shared hit=64 read=24503
-> Index Scan using bar_idx on foobar (cost=0.43..124206.02 rows=1724882 width=8) (actual time=0.018..581.665 rows=1724882 loops=1)
Buffers: shared hit=64 read=24503
Total runtime: 1484.974 ms
(5 lignes)
Third improvement, subset of table :
In my case I eventually need to select on this table along with another table, so it seems to make sense to create a subset of the table as a table of its own :
CREATE TABLE subfoobar AS (SELECT * FROM foobar WHERE bar IN (SELECT DISTINCT bar FROM othertable) ORDER BY bar);
The new table as only 700k rows instead of 1.7 million, and the query time seems (after recreating an index on bar
) roughly proportional, so the gain is significant :
> EXPLAIN (ANALYZE, BUFFERS) SELECT count(foo) OVER (PARTITION BY bar) FROM subfoobar;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.42..37455.61 rows=710173 width=8) (actual time=0.025..543.437 rows=710173 loops=1)
Buffers: shared hit=10290
-> Index Scan using bar_sub_idx on subfoobar (cost=0.42..26803.02 rows=710173 width=8) (actual time=0.015..222.211 rows=710173 loops=1)
Buffers: shared hit=10290
Total runtime: 590.063 ms
(5 lignes)
Fourth improvement, summary table :
Since IRL the window function is involved many times in the query, the query itself will be performed many times (data mining), and the result of aggregates over the partitions will always be the same, I decided to opt for a more efficient approach : I extracted all of these values into a new "summary table" (not sure my definition matches the "official" one ?).
In our simple example, this would give
CREATE TABLE summary_foobar AS SELECT DISTINCT ON (bar) count(foo) OVER (PARTITION BY bar) AS cfoo, bar FROM foobar;
Actually, as suggested by hbn in comments, it's even better to create a MATERIALIZED VIEW
instead of a new table, so that we can update it anytime simply with REFRESH MATERIALIZED VIEW summary_foobar;
:
CREATE MATERIALIZED VIEW summary_foobar AS SELECT DISTINCT ON (bar) count(foo) OVER (PARTITION BY bar) AS cfoo, bar FROM foobar;
Then, applying the initial query to my real case tables :
> EXPLAIN (ANALYZE, BUFFERS) SELECT cfoo FROM subfoobar,summary_foobar WHERE subfoobar.bar=summary_foobar.bar;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=1254.64..28939.67 rows=424685 width=73) (actual time=9.893..268.704 rows=370393 loops=1)
Hash Cond: (subfoobar.bar = summary_foobar.bar)
Buffers: shared hit=8916
-> Seq Scan on subfoobar (cost=0.00..15448.73 rows=710173 width=4) (actual time=0.003..70.850 rows=710173 loops=1)
Buffers: shared hit=8347
-> Hash (cost=873.73..873.73 rows=30473 width=77) (actual time=9.872..9.872 rows=30473 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 3347kB
Buffers: shared hit=569
-> Seq Scan on summary_foobar (cost=0.00..873.73 rows=30473 width=77) (actual time=0.003..4.569 rows=30473 loops=1)
Buffers: shared hit=569
Total runtime: 286.910 ms [~550 ms if using foobar instead of subfoobar]
(11 lignes)
All in all, for my real case queries I went down from 5000+ ms per query to about 150 ms (less than the example because of the WHERE
clauses).
CREATE MATERIALIZED summary_foobar AS SELECT DISTINCT ON (bar) count(foo) OVER (PARTITION BY bar) AS cfoo, bar FROM foobar;
. You can query in the same way, but it's easy to refresh the contents withREFRESH MATERIALIZED VIEW summary_foobar;
, rather than dropping and recreating / truncating and reinserting. – WameCREATE MATERIALIZED VIEW summary_foobar AS SELECT DISTINCT ON (bar) count(foo) OVER (PARTITION BY bar) AS cfoo, bar FROM foobar;
– Wame