TimescaleDB: SELECT COUNT(*) slow on hypertable
Asked Answered
V

1

5

I have a hyper table for exchange candle data set up using TimescaleDB.

  • TimescaleDB official image timescale/timescaledb:latest-pg12 set up and running with Docker with the exact version string starting PostgreSQL 12.6 on x86_64-pc-linux-musl, compiled by gcc (Alpine 10.2.1_pre1) 10.2.1 20201203, 64-bit

  • Python 3 client

  • The table has 5 continuous aggregate views set up like here and around 15 colums

Running the following query is slow (count query generated with SQLAlchemy):

SELECT count(*) AS count_1
FROM (SELECT my_hypertable.timestamp AS my_hypertable_timestamp
FROM my_hypertable) AS anon_1

Getting a count for a hypertable with 14,000 rows takes 7 seconds on a beefy server, measured from Python script. Whereas for a normal table the query time seems to be sub 10 ms. This makes the COUNT(*) on hypertable unusable in most of the situations.

  • Is there something in TimescaleDB architecture that makes COUNT(*) especially slow?

  • If there is something wrong with my set-up, any pointers or guesses what it could be or how to diagnose this further

  • If the count() is going to be super slow for architectural reasons, what are the recommendations to work around this - or should it be avoided altogether?

Veradia answered 12/6, 2021 at 21:20 Comment(1)
cybertec-postgresql.com/en/postgresql-count-made-fastAvalokitesvara
H
9

you can try the approximate_row_count() function (https://docs.timescale.com/api/latest/analytics/approximate_row_count/) which gives an immediate result.

Harbourage answered 13/6, 2021 at 5:10 Comment(1)
Thank you so much Benoit! Is there any particular reason why the normal COUNT is slow, so I could understand the hypertable limitations better?Veradia

© 2022 - 2024 — McMap. All rights reserved.