Postgres pg_dump effect on production DB
Asked Answered
C

1

12

Is it possible to query how many multiversioned rows are there in a db ?

We want to measure the impact of pg_dump on a production database, and to suspend it in case of need: is it creating too many multiversioned rows?

Thanks in advance

Cyclonite answered 2/5, 2018 at 0:1 Comment(0)
C
16

pg_dump doesn't create any row versions (a.k.a tuples).

The only impact of pg_dump are the increased I/O load and the long running transaction it creates. The long transaction will keep autovacuum from reclaimimg dead tuples for the duration of the dump.

Normally that is no big problem unless you have very high write activity in the database. To mitigate that problem, you could create a streaming replication standby server, set max_standby_streaming_delay to more than the duration of pg_dump and let pg_dump run there.

Coming answered 2/5, 2018 at 7:18 Comment(2)
Thanks for the clarification Laurenz. Anyway, these "dead' tuples are not really dead because they are visible to pg_dump, itself, but it may happen that updates on them produce "multi-versioned" rows maybe even migrating to different blocks and then causing reindexing, which is quite bad. Maybe there is a way to detect when this is happening?Cyclonite
Oh, I understand. Dead tuples cannot be cleaned up during pg_dump, because it is running in a long read only transaction. You can measure the impact by comparing n_dead_tup in pg_stat_all_tables for all tables before and immediately after the dump.Coming

© 2022 - 2024 — McMap. All rights reserved.