Postgres SLOWER when a LIMIT is set: how to fix besides adding a dummy `ORDER BY`?

In Postgres, some queries are a whole lot slower when adding a LIMIT:

The queries:

SELECT * FROM review WHERE clicker_id=28 ORDER BY done DESC LIMIT 4; -- 51 sec
SELECT * FROM review WHERE clicker_id=28 ORDER BY id, done DESC LIMIT 4; -- 0.020s
SELECT * FROM review WHERE clicker_id=28 LIMIT 4; -- 0.007s
SELECT * FROM review WHERE clicker_id=28 ORDER BY id; -- 0.007s

As you can see, I need to add a dummy id to the ORDER BY in order for things to go fast. I'm trying to understand why.

Running EXPLAIN on them:

EXPLAIN SELECT * FROM review WHERE clicker_id=28 ORDER BY done DESC LIMIT 4;
EXPLAIN SELECT * FROM review WHERE clicker_id=28 ORDER BY id, done DESC LIMIT 4;
EXPLAIN SELECT * FROM review WHERE clicker_id=28 LIMIT 4;
EXPLAIN SELECT * FROM review WHERE clicker_id=28 ORDER BY id;

gives this:

EXPLAIN SELECT * FROM review WHERE clicker_id=28 ORDER BY done DESC LIMIT 4
Limit  (cost=0.44..249.76 rows=4 width=56)
  ->  Index Scan using review_done on review  (cost=0.44..913081.13 rows=14649 width=56)
        Filter: (clicker_id = 28)


EXPLAIN SELECT * FROM review WHERE clicker_id=28 ORDER BY id, done DESC LIMIT 4
Limit  (cost=11970.75..11970.76 rows=4 width=56)
  ->  Sort  (cost=11970.75..12007.37 rows=14649 width=56)
        Sort Key: id, done DESC
        ->  Index Scan using review_clicker_id on review  (cost=0.44..11751.01 rows=14649 width=56)
              Index Cond: (clicker_id = 28)


EXPLAIN SELECT * FROM review WHERE clicker_id=28 LIMIT 4
Limit  (cost=0.44..3.65 rows=4 width=56)
  ->  Index Scan using review_clicker_id on review  (cost=0.44..11751.01 rows=14649 width=56)
        Index Cond: (clicker_id = 28)


EXPLAIN SELECT * FROM review WHERE clicker_id=28 ORDER BY id
Sort  (cost=12764.61..12801.24 rows=14649 width=56)
  Sort Key: id
  ->  Index Scan using review_clicker_id on review  (cost=0.44..11751.01 rows=14649 width=56)
        Index Cond: (clicker_id = 28)

I'm no SQL expert, but I take it Postgres expected the query to be faster than it actually is, and so chose an inappropriate way to fetch the data, correct?

The database:

  • The review table:
    • Contains 22+ million rows.
      • A given user will get 7,066 rows at most.
      • The one in the test (id 28) has 288 rows at the time of writing.
    • Has this structure:
      • id: bigint Auto Increment [nextval('public.review_id_seq')]
      • type: review_type NULL
      • iteration: smallint NULL
      • repetitions: smallint NULL
      • due: timestamptz NULL
      • done: timestamptz NULL
      • added: timestamptz NULL
      • clicker_id: bigint NULL
      • monologue_id: bigint NULL
    • Has these indexes:
      • UNIQUE type, clicker_id, monologue_id, iteration
      • INDEX clicker_id
      • INDEX done, due, monologue_id
      • INDEX id
      • INDEX done DESC
      • INDEX type
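
For reference, the structure and indexes above translate to roughly the following DDL. This is a sketch reconstructed from the list: review_clicker_id and review_done appear in the EXPLAIN output above, the other index names are invented here, and constraint details are guesses.

-- Sketch only: column list from the description above; review_type is a custom type.
CREATE TABLE review (
    id           bigint NOT NULL DEFAULT nextval('public.review_id_seq'),
    type         review_type,
    iteration    smallint,
    repetitions  smallint,
    due          timestamptz,
    done         timestamptz,
    added        timestamptz,
    clicker_id   bigint,
    monologue_id bigint
);

-- Index names other than review_clicker_id and review_done are invented:
CREATE UNIQUE INDEX review_unique ON review (type, clicker_id, monologue_id, iteration);
CREATE INDEX review_clicker_id ON review (clicker_id);
CREATE INDEX review_done_due_monologue_id ON review (done, due, monologue_id);
CREATE INDEX review_id ON review (id);
CREATE INDEX review_done ON review (done DESC);
CREATE INDEX review_type ON review (type);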

Additional details:

Environment:

  • The queries were run in development with Postgres 9.6.14.
  • In production (Heroku Postgres, version 9.6.16), the difference is less dramatic, but still not great: the slow queries might take 600 ms.

Variable speed:

  • Sometimes the same queries (whether the exact same query, or one for a different clicker_id) run a lot faster (under 1 sec), but I don't understand why. I need them to be consistently fast.
  • If I use LIMIT 288 for a user that has 288 rows, it's much faster (< 1 sec), but doing the same for a user with, say, 7,066 rows is back to super slow.

Before I figured out the dummy ORDER BY trick, I tried a few other approaches; none helped.

The question:

My issue in itself is solved, but I'm dissatisfied with it:

  • Is there a name for this "pattern" that consists of adding a dummy ORDER BY to speed things up?
  • How can I spot such issues in the future? (This took ages to figure out.) Unless I missed something, plain EXPLAIN is not that useful; see the EXPLAIN (ANALYZE) sketch after this list:
    • For the slow query, the estimated cost is misleadingly low, while for the fast variant it's misleadingly high.
  • Alternative: is there another way to handle this? (Because this solution feels like a hack.)
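
A note on spotting this: EXPLAIN (ANALYZE, BUFFERS) actually executes the query and reports real row counts and timings alongside the estimates, and a large gap between estimated and actual rows is the usual sign of a misestimate. A sketch (beware: the slow variant will take its full ~51 sec, since the query really runs):

-- Unlike plain EXPLAIN, this executes the query and shows actual vs. estimated rows.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM review WHERE clicker_id=28 ORDER BY done DESC LIMIT 4;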

Thanks!


Asked by Second on 7/2/2020 at 14:30 – Comments (2):
what happens when you run an ANALYZE on that table and then run the SELECT? – Mother
@RajVerma Nothing particular, it's still just as slow as before. – Second

The underlying issue here is what's called an abort-early query plan. Here's a thread from pgsql-hackers describing something similar:

https://www.postgresql.org/message-id/541A2335.3060100%40agliodbs.com

Quoting from there, this is why the planner is using the often-extremely-slow index scan when the ORDER BY done DESC is present:

As usual, PostgreSQL is dramatically undercounting n_distinct: it shows chapters.user_id at 146,000 and the ratio of to_user_id:from_user_id as being 1:105 (as opposed to 1:6, which is about the real ratio). This means that PostgreSQL thinks it can find the 20 rows within the first 2% of the index ... whereas it actually needs to scan 50% of the index to find them.

In your case, the planner thinks that if it just starts going through rows ordered by done descending (in other words, using the review_done index), it will find 4 rows with clicker_id=28 quickly. Since the rows need to be returned in done-descending order, it thinks this will save a sort step and be faster than retrieving all rows for clicker 28 and then sorting. Given the real-world distribution of rows, this can often turn out not to be the case, requiring it to skip a huge number of rows before finding 4 with clicker_id=28.
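
If undercounted statistics are the culprit here too, it may be worth checking what the planner believes about clicker_id and, if the numbers are far off, raising that column's statistics target and re-analyzing. This is a hedged suggestion rather than a guaranteed fix, and the 1000 below is an arbitrary example value:

-- What the planner currently believes about clicker_id's distribution:
SELECT attname, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'review' AND attname = 'clicker_id';

-- Gather finer-grained statistics for that column, then refresh them:
ALTER TABLE review ALTER COLUMN clicker_id SET STATISTICS 1000;
ANALYZE review;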

A more general way of handling it is to use a CTE (which, in 9.6, is still an optimization fence; this changes in PG 12, FYI) to fetch the rows without an ORDER BY, then add the ORDER BY in the outer query. Fetching all rows for a user, sorting them, and returning however many you need is completely reasonable for your dataset (even the 7k-row clicker shouldn't be an issue), so you can prevent the planner from believing an abort-early plan will be fastest by having no ORDER BY and no LIMIT inside the CTE. That gives you a query like:

WITH clicker_rows AS (SELECT * FROM review WHERE clicker_id=28)
SELECT * FROM clicker_rows ORDER BY done DESC LIMIT 4;

This should be reliably fast while still respecting the ORDER BY and the LIMIT you want. I'm not sure if there's a name for this pattern, though.

Answered by Mathematical on 7/2/2020 at 17:27 – Comments (3):
Thanks again for this answer (from almost 1 year and 9 months ago!). The ORDER BY trick worked in Postgres 9.6, but not in Postgres 13, with queries taking minutes. Using a CTE as in your example (or a subquery) does work nicely. – Second
The CTE method doesn't work for Postgres 14. – Inexpungible
Will come back later to edit the answer for PG 12+, but the short version is to use AS MATERIALIZED to force the CTE to be evaluated separately. – Mathematical
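
Pending that edit: the PG 12+ variant hinted at in the last comment would presumably look like this (untested sketch; AS MATERIALIZED restores the optimization-fence behavior that plain CTEs lost in PG 12):

-- AS MATERIALIZED (PG 12+) forces the CTE to be evaluated on its own,
-- keeping the planner from folding it into an abort-early plan.
WITH clicker_rows AS MATERIALIZED (SELECT * FROM review WHERE clicker_id=28)
SELECT * FROM clicker_rows ORDER BY done DESC LIMIT 4;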
