Improving PARTITION BY performance?

Consider the following table:

 foo | bar
-----+-----
  3  |  1
  8  |  1
  2  |  1
  8  |  5
  6  |  5
  5  |  5
  4  |  5
  5  |  7
  4  |  7

Column foo contains arbitrary values. Column bar is almost ordered: rows sharing a bar value follow each other. The table contains ~1.7 million rows in total, with about 15 rows per distinct bar value.

I find PARTITION BY quite slow and am wondering if I can do anything to improve its performance?

I tried CREATE INDEX bar_idx ON foobar(bar), but it had no effect on performance (IRL there is already a primary key on another column of the table). I'm using PostgreSQL 9.3.5.
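
For reference, a sketch of the setup implied here (column types are taken from the comments below: foo is real, bar is integer; the primary-key column is made up):

CREATE TABLE foobar (
    id  serial PRIMARY KEY, -- stand-in for the actual primary key column
    foo real,
    bar integer
);
CREATE INDEX bar_idx ON foobar (bar);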

Here are the EXPLAIN ANALYZE outputs for a simple query with and without PARTITION BY:

> EXPLAIN ANALYZE SELECT count(foo) OVER (PARTITION BY bar) FROM foobar;
                                                           QUERY PLAN                                                       
--------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=262947.92..293133.35 rows=1724882 width=8) (actual time=2286.082..3504.372 rows=1724882 loops=1)
   ->  Sort  (cost=262947.92..267260.12 rows=1724882 width=8) (actual time=2286.068..2746.047 rows=1724882 loops=1)
         Sort Key: bar
         Sort Method: external merge  Disk: 27176kB
         ->  Seq Scan on foobar  (cost=0.00..37100.82 rows=1724882 width=8) (actual time=0.019..441.827 rows=1724882 loops=1)
 Total runtime: 3606.695 ms
(6 rows)

> EXPLAIN ANALYZE SELECT foo FROM foobar;
                                                     QUERY PLAN                                                 
--------------------------------------------------------------------------------------------------------------------
 Seq Scan on foobar  (cost=0.00..37100.82 rows=1724882 width=4) (actual time=0.014..385.931 rows=1724882 loops=1)
 Total runtime: 458.776 ms
(2 rows)

First improvement, increasing work_mem:

In most cases increasing work_mem, as suggested by hbn, should help. In my case the data sits on an SSD, so switching the sort to RAM (by increasing work_mem to 1 GB) only reduces the processing time by a factor of about 1.5.
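
For the record, the session-level command used (same syntax as in hbn's answer below):

SET work_mem TO '1GB';

The plan then becomes: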

> EXPLAIN (ANALYZE, BUFFERS) SELECT count(foo) OVER (PARTITION BY bar) FROM foobar;
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=215781.92..245967.35 rows=1724882 width=8) (actual time=933.575..1931.656 rows=1724882 loops=1)
   Buffers: shared hit=2754 read=17098
   ->  Sort  (cost=215781.92..220094.12 rows=1724882 width=8) (actual time=933.558..1205.314 rows=1724882 loops=1)
         Sort Key: bar
         Sort Method: quicksort  Memory: 130006kB
         Buffers: shared hit=2754 read=17098
         ->  Seq Scan on foobar  (cost=0.00..37100.82 rows=1724882 width=8) (actual time=0.023..392.446 rows=1724882 loops=1)
               Buffers: shared hit=2754 read=17098
 Total runtime: 2051.494 ms
(9 rows)

Second improvement, using CLUSTER:

I tried a few of the suggestions from this post; increasing statistics had no significant effect in my case. The only one that helped, and was not already active, was to "rewrite the table in the physical order of the index" using CLUSTER (you may prefer pg_repack, sketched after the plan below; read the original post):

> CLUSTER foobar USING bar_idx;
CLUSTER
> EXPLAIN (ANALYZE, BUFFERS) SELECT count(foo) OVER (PARTITION BY bar) FROM foobar;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=0.43..150079.25 rows=1724882 width=8) (actual time=0.031..1372.416 rows=1724882 loops=1)
   Buffers: shared hit=64 read=24503
   ->  Index Scan using bar_idx on foobar  (cost=0.43..124206.02 rows=1724882 width=8) (actual time=0.018..581.665 rows=1724882 loops=1)
         Buffers: shared hit=64 read=24503
 Total runtime: 1484.974 ms
(5 rows)
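
A note on pg_repack: unlike CLUSTER it avoids holding an exclusive lock for the whole rewrite; if I read its documentation correctly, the invocation would look something like this (flags and database name are assumptions to verify against pg_repack --help):

pg_repack --table=foobar --order-by=bar mydb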

Third improvement, subset of the table:

In my case I eventually need to select from this table along with another table, so it seems to make sense to create the relevant subset of the table as a table of its own:

CREATE TABLE subfoobar AS (SELECT * FROM foobar WHERE bar IN (SELECT DISTINCT bar FROM othertable) ORDER BY bar);
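
The index on bar then has to be recreated on the new table (the name bar_sub_idx matches the plan below; the ANALYZE afterwards is my own habit, not strictly required):

CREATE INDEX bar_sub_idx ON subfoobar (bar);
ANALYZE subfoobar;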

The new table has only 700k rows instead of 1.7 million, and with the index recreated as above the query time seems roughly proportional, so the gain is significant:

> EXPLAIN (ANALYZE, BUFFERS) SELECT count(foo) OVER (PARTITION BY bar) FROM subfoobar;
                                                                      QUERY PLAN                                                                       
-------------------------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=0.42..37455.61 rows=710173 width=8) (actual time=0.025..543.437 rows=710173 loops=1)
   Buffers: shared hit=10290
   ->  Index Scan using bar_sub_idx on subfoobar  (cost=0.42..26803.02 rows=710173 width=8) (actual time=0.015..222.211 rows=710173 loops=1)
         Buffers: shared hit=10290
 Total runtime: 590.063 ms
(5 rows)

Fourth improvement, summary table:

Since IRL the window function is involved many times in the query, the query itself is run many times (data mining), and the aggregates over the partitions always yield the same values, I decided to opt for a more efficient approach: I extracted all of these values into a new "summary table" (I'm not sure my definition matches the official one).

In our simple example, this would give:

CREATE TABLE summary_foobar AS SELECT DISTINCT ON (bar) count(foo) OVER (PARTITION BY bar) AS cfoo, bar FROM foobar;
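
As an aside, the same summary can be computed with a plain GROUP BY, avoiding both the window sort and the DISTINCT ON; the SELECT below yields the same (cfoo, bar) pairs:

SELECT count(foo) AS cfoo, bar FROM foobar GROUP BY bar;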

Actually, as suggested by hbn in the comments, it's even better to create the summary as a MATERIALIZED VIEW, which can be refreshed at any time with a simple REFRESH MATERIALIZED VIEW summary_foobar; instead of being dropped and recreated:

CREATE MATERIALIZED VIEW summary_foobar AS SELECT DISTINCT ON (bar) count(foo) OVER (PARTITION BY bar) AS cfoo, bar FROM foobar;
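
As a side note, from PostgreSQL 9.4 on the view can be refreshed without blocking readers, provided it has a unique index (a sketch; on the 9.3.5 used here only the plain REFRESH is available):

CREATE UNIQUE INDEX ON summary_foobar (bar);
REFRESH MATERIALIZED VIEW CONCURRENTLY summary_foobar;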

Then, applying the initial query to my real-case tables:

> EXPLAIN (ANALYZE, BUFFERS) SELECT cfoo FROM subfoobar,summary_foobar WHERE subfoobar.bar=summary_foobar.bar;
                                                          QUERY PLAN                                                      
------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1254.64..28939.67 rows=424685 width=73) (actual time=9.893..268.704 rows=370393 loops=1)
   Hash Cond: (subfoobar.bar = summary_foobar.bar)
   Buffers: shared hit=8916
   ->  Seq Scan on subfoobar  (cost=0.00..15448.73 rows=710173 width=4) (actual time=0.003..70.850 rows=710173 loops=1)
         Buffers: shared hit=8347
   ->  Hash  (cost=873.73..873.73 rows=30473 width=77) (actual time=9.872..9.872 rows=30473 loops=1)
         Buckets: 4096  Batches: 1  Memory Usage: 3347kB
         Buffers: shared hit=569
         ->  Seq Scan on summary_foobar  (cost=0.00..873.73 rows=30473 width=77) (actual time=0.003..4.569 rows=30473 loops=1)
               Buffers: shared hit=569
 Total runtime: 286.910 ms [~550 ms if using foobar instead of subfoobar]
(11 rows)

All in all, for my real-case queries I went down from 5000+ ms per query to about 150 ms (less than in the example above because of the WHERE clauses).

Brand answered 30/9, 2014 at 17:31 Comment(3)
Glad you managed to find a way to optimise things. If you're on >=9.3, you might find it useful to use a materialized view for your summary table. Create with something like CREATE MATERIALIZED summary_foobar AS SELECT DISTINCT ON (bar) count(foo) OVER (PARTITION BY bar) AS cfoo, bar FROM foobar;. You can query in the same way, but it's easy to refresh the contents with REFRESH MATERIALIZED VIEW summary_foobar;, rather than dropping and recreating / truncating and reinserting.Wame
Apologies, I missed out "VIEW" in the create command - s/b CREATE MATERIALIZED VIEW summary_foobar AS SELECT DISTINCT ON (bar) count(foo) OVER (PARTITION BY bar) AS cfoo, bar FROM foobar;Wame
Great tip, thanks ! I'll update the answer with it.Brand

You probably need to increase work_mem. Your query is using an on-disk sort; it's spilling 27 MB to disk. Try setting work_mem to 64 MB or so and see how it performs then.

You can set it per session or transaction as well as in postgresql.conf.

SET work_mem TO '64MB';

This will set it for your current session.
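
The transaction-scoped variant uses SET LOCAL, which reverts automatically at COMMIT or ROLLBACK; a sketch:

BEGIN;
SET LOCAL work_mem TO '64MB'; -- applies to this transaction only
SELECT count(foo) OVER (PARTITION BY bar) FROM foobar;
COMMIT;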

Obviously a sensible value depends on how much RAM you have in your machine and the number of concurrent connections you expect to have.

Wame answered 30/9, 2014 at 21:11 Comment(5)
Good advice, I hadn't noticed that the EXPLAIN ANALYZE mentioned the sort was done on disk. Anyway, I had already set work_mem to 1 GB after submitting the question (BTW, the command to get the current work_mem: SHOW work_mem). However my disk is an SSD, so working in RAM only reduces the processing time by a factor of 1.5.Brand
So almost twice as quick? What does the plan with timing look like now?Wame
Can you include buffers in your analyse output too please?Wame
Is foo a text / varchar column? If it is, what is its collation (or the database collation if it's the default)?Wame
Though the table has text columns (with collation fr_FR.UTF-8), in the query tested in this question foo is real and bar is integer.Brand

Upd 2014-10-28: using (data-dependent) function-based indexes is not possible in this case, as I had to learn :-/ (thanks to a_horse_with_no_name):

  • index definition functions must be IMMUTABLE

    • http://www.postgresql.org/docs/9.3/static/sql-createindex.html

      All functions and operators used in an index definition must be "immutable", that is, their results must depend only on their arguments and never on any outside influence (such as the contents of another table or the current time). This restriction ensures that the behavior of the index is well-defined. To use a user-defined function in an index expression or WHERE clause, remember to mark the function immutable when you create it.

    • this confused me because it's different from a (STABLE) index scan condition, as explained here

So my initial approach could only work with a manually maintained and indexed table column count_bar_by_foo, updated by a trigger (for all rows containing the same foo value)... rather ugly, if not really useful/necessary/avoidable; a rough sketch follows.
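
This is roughly what that trigger could look like, using a small side table instead of a duplicated column (which avoids rewriting every sibling row). All names are hypothetical, it counts rows per bar as in the question (this answer's prose swaps foo and bar), and UPDATEs of bar as well as concurrent writers are ignored for brevity:

CREATE TABLE foobar_counts (
    bar integer PRIMARY KEY,
    cnt bigint NOT NULL
);

CREATE FUNCTION maintain_foobar_counts() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE foobar_counts SET cnt = cnt + 1 WHERE bar = NEW.bar;
        IF NOT FOUND THEN
            INSERT INTO foobar_counts (bar, cnt) VALUES (NEW.bar, 1);
        END IF;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE foobar_counts SET cnt = cnt - 1 WHERE bar = OLD.bar;
    END IF;
    RETURN NULL; -- AFTER trigger: the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER foobar_counts_trg
AFTER INSERT OR DELETE ON foobar
FOR EACH ROW EXECUTE PROCEDURE maintain_foobar_counts();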


original wrong answer...

If the query is really important, you could create a function-based index on count(bar) OVER (PARTITION BY foo), which would probably be quite small on disk (and in RAM) and quick to search.

Maybe one would have to wrap it in a self-written SQL function like count_by_foo(bar) to make it actually work (I did not check the CREATE INDEX syntax restrictions for this; the idea is what matters); spelled out below.
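
Spelled out (names hypothetical, following this answer's foo/bar orientation): a function that reads the table can honestly only be marked STABLE, and CREATE INDEX then rejects it:

CREATE FUNCTION count_by_foo(f real) RETURNS bigint AS $$
    SELECT count(bar) FROM foobar WHERE foo = f;
$$ LANGUAGE sql STABLE;

CREATE INDEX ON foobar (count_by_foo(foo));
-- ERROR:  functions in index expression must be marked IMMUTABLE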

Cheiron answered 28/10, 2014 at 9:42 Comment(1)
You can't create an index based on a window function.Marella
