Postgres log file contains: missing chunk number 0 for toast value 815441 in pg_toast_2619

The log message below appears in the Postgres log file several thousand times. How can it be resolved?

missing chunk number 0 for toast value 815441 in pg_toast_2619.

pg_toast_2619 is the TOAST table of pg_statistic. pg_statistic also contains duplicated records. How can this situation be resolved, and what is the reason behind it?

Discernment answered 28/11, 2017 at 14:12 Comment(0)

Something went wrong with your server. Did the server crash? A disk failure? In any case, you could do the following:

  1. Stop your server and make a physical copy of your data directory to a secure place;
  2. Since pg_statistic is populated by ANALYZE, simply clean it with DELETE FROM pg_catalog.pg_statistic; and issue an ANALYZE afterwards, as sketched below.
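A minimal psql session for step 2 might look like this (just a sketch, assuming a superuser connection to the affected database):

-- Repopulate pg_statistic from scratch; ANALYZE rebuilds the statistics.
DELETE FROM pg_catalog.pg_statistic;
ANALYZE;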

If the error persists:

  1. Enable allow_system_table_mods and then restart your server: ALTER SYSTEM SET allow_system_table_mods = ON; (Postgres 9.4+)
  2. Truncate pg_statistic in the database where you're getting the error: TRUNCATE TABLE pg_catalog.pg_statistic;
  3. Analyze the entire database again: ANALYZE VERBOSE;
  4. Disable allow_system_table_mods: ALTER SYSTEM RESET allow_system_table_mods;
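Put together, the whole sequence looks like this (a sketch; note that before Postgres 12, allow_system_table_mods only takes effect after a server restart):

ALTER SYSTEM SET allow_system_table_mods = ON;
-- restart the server here so the setting takes effect
TRUNCATE TABLE pg_catalog.pg_statistic;
ANALYZE VERBOSE;
ALTER SYSTEM RESET allow_system_table_mods;
-- restart again so the reset takes effect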

You may need to REINDEX SYSTEM after doing this.
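For reference, REINDEX SYSTEM takes the name of the current database (mydatabase below is just a placeholder):

REINDEX SYSTEM mydatabase;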

More info about allow_system_table_mods here.

Studio answered 28/11, 2017 at 16:21 Comment(4)
Does this issue exist only in Postgres 9.2? – Discernment
I saw it happening in a 9.5 cluster, but in that case the problem was the disk, not Postgres itself; the pg_statistic corruption was just the consequence. – Studio
Thank you for this answer, it helped a lot! I tried everything and didn't realize I could just delete from the pg_catalog.pg_statistic table. – Rikki
The first two steps worked for me, thank you @MichelMilezzi. – Vanderpool

Let's suppose that the corrupted table is called mytable. Many articles on the Internet suggest firing the following query against the database:

psql> select reltoastrelid::regclass from pg_class where relname = 'mytable';

 reltoastrelid      
-------------------------
 pg_toast.pg_toast_40948
(1 row)

and then firing the following commands:

REINDEX TABLE mytable;
REINDEX TABLE pg_toast.pg_toast_40948;
VACUUM ANALYZE mytable;

But in my case this was not enough. Then, I computed the number of rows in mytable:

psql> select count(*) from mytable;

 count
-------
 58223

To find the corruption, you can fetch data from the table until you get the 'Missing chunk...' error. So the following group of queries does the job:

select * from mytable order by id limit 5000 offset 0;
select * from mytable order by id limit 5000 offset 5000;
select * from mytable order by id limit 5000 offset 10000;
select * from mytable order by id limit 5000 offset 15000;
select * from mytable order by id limit 5000 offset 20000;
...

...and so on, until you get the error. In this example, if you reach offset 55000 without getting the error (55000 + 5000 = 60000, which exceeds the total number of records), then your table is not corrupted. The order by clause is necessary to make your query repeatable, i.e. to ensure that it does not return rows in a random order, so that the limit and offset clauses work as expected. If your table does not have an id field, you have to find another good field to order by; for performance reasons, an indexed field is preferable.

To go faster and avoid cluttering your console, the queries can be run directly from the shell, redirecting the output to /dev/null and printing an error message only when an error is found:

psql -U pgsql -d mydatabase -c "select * from mytable order by id limit 5000 offset 0" > /dev/null || echo "Corrupted chunk read!"

The above syntax means: execute the query and redirect the output to /dev/null; in case of error (||, i.e. a non-zero exit status), print an error message.
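If you don't want to type each probe by hand, the same one-liner can be wrapped in a small shell loop (a sketch; the 5000-row window and the 58223-row total come from the example above):

#!/bin/sh
# Probe the table in 5000-row windows until a corrupted chunk is hit.
total=58223
step=5000
offset=0
while [ $offset -lt $total ]; do
  psql -U pgsql -d mydatabase -c "select * from mytable order by id limit $step offset $offset" > /dev/null || echo "Corrupted chunk in rows $offset..$((offset + step - 1))"
  offset=$((offset + step))
done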

Let's suppose that the first query giving the error is the following:

select * from mytable order by id limit 5000 offset 10000;
Corrupted chunk read!
>

Now, you know that the corrupted chunk is in the rows between 10000 and 14999. So, you can narrow the search by halving the query LIMIT clause.

select * from mytable order by id limit 2500 offset 10000;
Corrupted chunk read!
>

So the error is in the rows between 10000 and 12499. We halve the row limit again.

select * from mytable order by id limit 1250 offset 10000;
>

Fetching the rows between 10000 and 11249 does not return any error. So the error must be in the rows between 11250 and 12499. We can confirm this by firing the query:

select * from mytable order by id limit 1250 offset 11250;
Corrupted chunk read!
>

So we halve the limit again.

select * from mytable order by id limit 625 offset 11250;
>
select * from mytable order by id limit 625 offset 11875;
Corrupted chunk read!
>
...

You should continue narrowing until you find the exact corrupted row:

...
select * from mytable order by id limit 1 offset 11963;
Corrupted chunk read!
>

Note that in this last query the LIMIT 1 clause identifies exactly one row.
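Instead of halving by hand, the same narrowing can be scripted as a binary search (a sketch in sh; lo and hi bound the window already known to contain a corrupted row, 10000 to 15000 in this example):

#!/bin/sh
# Binary-search one corrupted row inside the window [lo, hi).
lo=10000
hi=15000
while [ $((hi - lo)) -gt 1 ]; do
  mid=$(( (lo + hi) / 2 ))
  # If the lower half reads cleanly, the corruption is in the upper half.
  if psql -U pgsql -d mydatabase -c "select * from mytable order by id limit $((mid - lo)) offset $lo" > /dev/null 2>&1; then
    lo=$mid
  else
    hi=$mid
  fi
done
echo "Corrupted row at offset $lo"

This takes a logarithmic number of queries instead of the linear row-by-row scan shown at the end of this answer.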

Finally, you have to find the id of the corrupted row and delete it (obviously losing that row's data):

psql> select id from mytable order by id limit 1 offset 11963;
   id
--------
 121212

psql> delete from mytable where id = 121212;
DELETE 1
>

While searching for the corrupted row, keep in mind that the corruption is most likely in the last inserted or updated records, even though this is not a general rule. You can therefore choose a sort key that follows the physical insert/update order to reduce the scan time.
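For instance, if the table tracks modification times (updated_at here is a hypothetical column), you can probe the most recently touched rows first:

select * from mytable order by updated_at desc, id limit 5000 offset 0;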

If you prefer to fully automate the search for corrupted rows, consider using the following script (in csh syntax):

#!/bin/csh
# 58223 is the total number of table rows
set j = 0
while ($j < 58223)
  psql -U pgsql -d mydatabase -c "SELECT * FROM mytable ORDER BY id LIMIT 1 OFFSET $j" > /dev/null || echo $j
  @ j++
end

This script prints the offset of every corrupted row it finds. For long tables it can take a long time, since it performs as many queries as there are rows in the table.

I posted this same problem in a gist of mine, here.

Yolandayolande answered 20/4, 2021 at 20:15 Comment(0)
