How list all tables with data changes in the last 24 hours?
Asked Answered
H

1

7

We had an ugly problem, by mistake, a balancer redirect some requests to a test instance with pretty similar data than production, now I know that there are data recorded in the test Postgres that belongs to production

Is there a way to list all the tables with data changes in the last 24 hours in Postgres?

Postgres version is 9.3 and I have around 250 tables.

Hypogenous answered 10/9, 2015 at 22:13 Comment(1)
Before you do anything else: shut down the test DB cluster immediately and make a file-level copy of the whole DB data directory. Internal data and / or current physical order of rows may hold valuable clues - until you do something else to the tables (including VACUUM). After starting the DB cluster again, also draw a plain backup of the test DB.Risibility
R
8

First consider my comment.

Postgres up to and including 9.4 does not by itself record timestamps when rows were inserted or updated.

There are some system columns in the row headers that can help in the forensic work. The physical order of rows (ctid) can be an indicator if nothing else has happened to the table since. In simple cases new rows are appended to the physical end of a table when inserted, so the ctid indicates what was inserted last - until anything changes in the table. Postgres is free to rearrange the physical order of rows at will, for instance with VACUUM. Any UPDATE also writes a new row version, which can change the physical position. The new version does not have to be at the end of the table. Postgres tries to keep new row version on the same data page if possible (HOT update) ...

That said, here is a simple query to get the physically last rows for a given table:

SELECT ctid, *
FROM   big
ORDER  BY ctid DESC
LIMIT  10;

Related answers on dba.SE with detailed information:

The insert transaction id xmin can be useful:

If you happen to have a backup for the test DB from right before the incident, that would be helpful. Restore the old state to a separate schema of the test DB and compare tables ...

Typically, I add one or two timestamptz columns to important tables for when the row was inserted, and / or when it was updated the last time. That would be tremendously useful for you right now ...

What would also be great for you: the "temporal" features introduced in the SQL standard with SQL:2011. But that's not implemented in Postgres, yet.
There's a page in the Postgres Wiki.
There is also an unofficial extension on PGXN. I have not tested it and can't say how far it is.

Postgres 9.5 introduces a feature to record commit timestamps (like @Craig commented). Needs to be enabled manually before it starts recording. The manual:

track_commit_timestamp (bool)

Record commit time of transactions. This parameter can only be set in postgresql.conf file or on the server command line. The default value is off.

And some functions to work with it.

Risibility answered 10/9, 2015 at 22:57 Comment(8)
Commit timestamps, introduced in 9.5, would also help with this if enabled.Herschelherself
@Craig: Very interesting. Would be gold for the situation at hand. Details in the 9.5 manual: postgresql.org/docs/9.5/static/… and postgresql.org/docs/9.5/static/… I'll add that to my answer if you don't mind.Risibility
Of course, go ahead. Note that commit timestamp collection must be enabled beforehand, though, and that commit timestamp information is not retained indefinitely.Herschelherself
Cleanup will be nasty if the test <--> production tables have separate surrogate keys, but no natural keys. Basically insolvable.Coca
I have a clean backup of the test db, ctid looks a good option, could you share a sample query please? I guess that will be posible make something like this mysql answer #2492540Hypogenous
@martinkenneth: I added a sample query and some links.Risibility
@martinkenneth: BTW, the question you linked is for MySQL and answers do not work for Postgres at all.Risibility
@ErwinBrandstetter yeah, I know, I was guessing that something similar was available in postgres, sadly noHypogenous

© 2022 - 2024 — McMap. All rights reserved.