PostgreSQL - how to restore very large database
I have a problem when I try to restore a large database (almost 32 GB in custom format) on my devel database node (this node has less RAM, CPU, etc. than my production server).

My database dumps are generated with a command similar to:

pg_dump -F custom -b myDB -Z 9 > /backup/myDB-`date +%y%m%d`.pg91

To restore it, I use the following command:

pg_restore -F custom -j 5 -d myDB /backup/myDB-20130331.pg91

But each time, the restore command fails with errors like:

pg_restore: [archiver (db)] error returned by PQputCopyData: server closed the connection unexpectedly
This probably means the server terminated abnormally
    before or while processing the request.
pg_restore: [archiver] worker process failed: exit code 1
pg_restore: [archiver (db)] error returned by PQputCopyData: server closed the connection unexpectedly
This probably means the server terminated abnormally
    before or while processing the request.
pg_restore: [archiver (db)] error returned by PQputCopyData: server closed the connection unexpectedly
This probably means the server terminated abnormally
    before or while processing the request.
pg_restore: [archiver (db)] error returned by PQputCopyData: server closed the connection unexpectedly
This probably means the server terminated abnormally
    before or while processing the request.

And when I check my postgresql logs, I can read this:

   HINT:  In a moment you should be able to reconnect to the database and repeat your command.
   LOG:  all server processes terminated; reinitializing
   LOG:  database system was interrupted; last known up at 2013-04-02 11:41:48 UTC
   LOG:  database system was not properly shut down; automatic recovery in progress
   LOG:  redo starts at 86/26F302B0
   LOG:  unexpected pageaddr 85/E3F52000 in log file 134, segment 38, offset 16064512
   LOG:  redo done at 86/26F51FC0
   LOG:  last completed transaction was at log time 2013-04-02 11:50:47.663599+00
   LOG:  database system is ready to accept connections
   LOG:  autovacuum launcher started

It's quite strange that my PostgreSQL server "restarts" on its own just because of my restore. I tried reducing the number of jobs (the -j 5 option) but still got the same problem. However, on a node with better specs, I have no problem restoring this database. I'm not sure, but maybe the updates of my indexes (one of them is really large) could be a clue to understanding this issue?

So I have some questions: is there a better way to restore a really large database? Am I missing something in my pg_restore command? Maybe the settings of my devel server are too low?

Any clue will be greatly appreciated. Thanks in advance.

env: PostgreSQL 9.1 (installed via Debian packages)

Lipocaic answered 2/4, 2013 at 13:37 Comment(10)
Looks like a backend crash, but you'll need to show more of the logs to really say for sure or to know why.Loads
Hi @CraigRinger, are you suggesting I make the logs more verbose to understand what's going on? OK, I'll try that and hope to see more info.Lipocaic
You might also reduce the number of jobs further, from 5 to just 2. That takes longer, but may be less demanding on your development node.Rileyrilievo
Hi @thisfeller, yes, I already reduced the number of jobs but still have the same problem.Lipocaic
If the development box is Intel, I'd suggest testing the memory with memtest86, since that's the kind of symptom faulty memory produces.Sandberg
Hi @DanielVérité, yep, you're right, it could be a factor even though I see nothing special about it. I'll test my memory as soon as possible.Lipocaic
If you are running Linux, your pg process is most likely a victim of the OOM killer. You should have a look at the kernel logs to see whether the OOM killer has indeed been triggered by a lack of free memory.Vargas
@Erwan No, I'm saying you should show more of the existing log, the last 30 lines or so rather than just the last 10.Loads
@DanielVérité Checking dmesg for hardware error monitoring alerts can be useful on modern hardware too. It'll also show any OOM killer invocations.Loads
I have a clue about my issue... I left autovacuum_max_workers at the default value (which is 3), and combined with the value of maintenance_work_mem I set, that could exceed the amount of memory on my node (if all the workers are launched). I reduced the number of workers and relaunched my restore... let's see if it's going to work.Lipocaic
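The back-of-the-envelope check in the last comment can be sketched as follows: each parallel pg_restore job (rebuilding indexes) and each autovacuum worker may allocate up to maintenance_work_mem at the same time, so the worst case is roughly their sum. All figures below are hypothetical examples, not the poster's actual settings:

```python
# Rough worst-case estimate of maintenance memory during a parallel
# pg_restore while autovacuum is enabled. Every number below is a
# hypothetical example, not the original poster's configuration.

def worst_case_maintenance_mem_mb(restore_jobs: int,
                                  autovacuum_max_workers: int,
                                  maintenance_work_mem_mb: int) -> int:
    """Each restore job (e.g. a CREATE INDEX step) and each autovacuum
    worker can use up to maintenance_work_mem concurrently, so the
    worst case is the sum over all of them."""
    return (restore_jobs + autovacuum_max_workers) * maintenance_work_mem_mb

# pg_restore -j 5, default autovacuum_max_workers = 3,
# and a (hypothetical) maintenance_work_mem of 1024 MB:
print(worst_case_maintenance_mem_mb(5, 3, 1024))  # prints 8192
```

8 GB of potential maintenance memory, on top of shared_buffers and per-connection work_mem, can easily exceed the RAM of a small devel node and invite the Linux OOM killer, which would match the "server closed the connection unexpectedly" symptom.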
For this kind of big operation, it is recommended to disable autovacuum (by setting it to off in your postgresql.conf) during the restore process.
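Concretely, this is a one-line change in postgresql.conf, picked up by a configuration reload (no full restart needed, since autovacuum is a reloadable setting). The file path shown is the usual Debian layout for 9.1 and is an assumption about this setup:

```
# /etc/postgresql/9.1/main/postgresql.conf  (typical Debian path; adjust for your install)
autovacuum = off    # disable while restoring; set back to on afterwards
```

Reload with pg_ctlcluster 9.1 main reload (or SELECT pg_reload_conf(); from psql), run the restore, then revert the setting, reload again, and run a manual VACUUM ANALYZE to refresh the statistics.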

It seems it finally works for me.

Lipocaic answered 12/4, 2013 at 8:56 Comment(7)
This worked for me – can you provide a source (or more information) on why it's recommended to disable autovacuum, or what it's specifically doing that would crash the database? Thanks!Disperse
The vacuum process can degrade the performance of your system (generating statistics, increasing I/O traffic, ...), so it's worth disabling it to avoid this kind of problem. However, running vacuum is important to keep statistics up to date, just not necessary during the dump. Don't forget to launch it manually afterwards or to re-enable the autovacuum daemon.Lipocaic
@Lipocaic could you clarify whether this is crucial during the dump or the restore? In the answer you talk about the restore and in the comment about the dump, or do you simply mean both? ;)Discussion
@Discussion in my case it was during the restore!Lipocaic
How to do this in PGadmin?Resuscitator
That can only work by coincidence, since autovacuum doesn't crash PostgreSQL.Crawley
The command alter system set autovacuum = off in psql seems to work too.Miyasawa
This was already mentioned in a comment, but I think it deserves a separate answer:

alter system set autovacuum = off

As this might cause problems for other processes, it is better to switch it back on after the restore:

alter system set autovacuum = on

Note that alter system is only available from PostgreSQL 9.4 onwards, and the change takes effect after a configuration reload (e.g. SELECT pg_reload_conf();).
Thayne answered 11/3 at 21:11 Comment(0)
I experienced the same error from pg_restore. When I looked into the postgres logs, I found out that my disk (this was a containerized database) was 100% full.

Marauding answered 17/10, 2023 at 6:51 Comment(0)
