Postgresql explicit VACUUM vs. auto-VACUUM: Differences? Recommendations?
Asked Answered
B

1

5

Quick question from a PostgreSQL (relative) newb:

We run a batch process that, as its final step, deletes most of the previous batches.

Disk space is a concern, so we need to ensure that PostgreSQL cleans up after itself.

Other than forcing PostgreSQL to garbage-collect faster, is there any difference between explicitly calling VACUUM at the end of the batch vs. letting the auto-VACUUM daemon handle it? Is there any reason to recommend one approach vs. the other?

Thanks!

Bennir answered 8/1, 2018 at 15:33 Comment(2)
A manual VACUUM will be a bit harder on I/O than autovacuum, as the latter is designed to minimize the impact on the server. Note that VACUUM/autovacuum will not reclaim space, they merely mark the slots as ready to be reused. If you actually need to reclaim the space, you'll probably want to investigate CLUSTER or VACUUM FULL etc.Deltoid
In addition: Maybe you can write the final results to a new table and drop the old one rather than deleting most of the table. That might be faster, and space will be reclaimed automatically.Childers
F
8

Way back when there was one vacuum, and it was full and blocking. Then PostgreSQL guys added non-blocking vacuum. But you still had to schedule it yourself.

Then some genius made a daemon that ran vacuum automatically for you when the tables needed it. It uses the exact same vacuum command you or I would use, but has a lot of settings, especially default ones, that make it run slower and less intrusively. Primarily these settings are for worker threads (default 3), delay cost (20ms for autovac, 0ms for regular vac) and autovacuum cost delay limit (-1, i.e. use system setting which is 200).

Therefore, regular vacuum is VERY aggressive with no cost delay, and will run as hard and fast as your IO subsystem will let it. It basically competes with your regular workload for IO bandwidth.

Generally you can do one of two things in your situation:

One: Make autovacuum more aggressive. By lowering the autovacuum_vacuum_cost_delay from 20 to something in the 2 to 5 range it will run much faster but still not get in the way too much.

Two: Run regular vacuums by hand. Since regular vacuums, by default, have no cost_delay, this will be the fastest but also the most distruptive.

Decision is yours based on usage patterns etc.

Fasto answered 9/1, 2018 at 21:5 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.