How do I fix Postgres so it will start after an abrupt shutdown?
Asked Answered
K

7

22

Due to a sudden power outage, the Postgres server running on my local machine shut down abruptly. After rebooting, I tried to restart Postgres and I get this error:

$ pg_ctl -D /usr/local/pgsql/data restart

pg_ctl: PID file "/usr/local/pgsql/data/postmaster.pid" does not exist
Is server running?
starting server anyway
server starting
$:/usr/local/pgsql/data$ LOG:  database system shutdown was interrupted at 2009-02-28 21:06:16 
LOG:  checkpoint record is at 2/8FD6F8D0
LOG:  redo record is at 2/8FD6F8D0; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 0/1888104; next OID: 1711752
LOG:  next MultiXactId: 2; next MultiXactOffset: 3
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 2/8FD6F918
LOG:  record with zero length at 2/8FFD94A8
LOG:  redo done at 2/8FFD9480
LOG:  could not fsync segment 0 of relation 1663/1707047/1707304: No such file or directory
FATAL:  storage sync failed on magnetic disk: No such file or directory
LOG:  startup process (PID 5465) exited with exit code 1
LOG:  aborting startup due to startup process failure

There is no postmaster.pid file in the data directory. What possibly could be the reason for this sort of behavior and of course what is the way out?

Koziara answered 28/2, 2009 at 15:53 Comment(3)
What postgres version are you using, and what is the type of the filesystem for /usr/local/pgsql/data?Covell
Just so you know, chances are you might have to restore from backup. But before doing that, please share with us your Postgres version (in v8.1.5 and v8.1.6 IIRC there was a bug triggering this error during recovery) and filesystem type (you might want to change that before the next outage.)Covell
hint: "restart", you're telling PostgreSQL that it is running and needs to be restarted. It's not running, thus there is no process id (.pid) file.Gradate
G
28

You'd need to pg_resetxlog. Your database can be in an inconsistent state after this though, so dump it with pg_dumpall, recreate and import back.

A cause for this could be:

  • You have not turned off hardware write cache on disk, which often prevents the OS from making sure data is written before it reports successful write to application. Check with

    hdparm -I /dev/sda

    If it shows "*" before "Write cache" then this could be the case. Source of PostgreSQL has a program src/tools/fsync/test_fsync.c, which tests speed of syncing data with disk. Run it - if it reports all times shorter than, say, 3 seconds than your disk is lying to OS - on a 7500rpm disks a test of 1000 writes to the same place would need at least 8 seconds to complete (1000/(7500rpm/60s)) as it can only write once per route. You'd need to edit this test_fsync.c if your database is on another disk than /var/tmp partition - change

    #define FSYNC_FILENAME "/var/tmp/test_fsync.out"

    to

    #define FSYNC_FILENAME "/usr/local/pgsql/data/test_fsync.out"

  • Your disk is failing and has a bad block, check with badblocks.

  • You have a bad RAM, check with memtest86+ for at least 8 hours.

Glochidium answered 2/3, 2009 at 10:58 Comment(5)
Thanks a ton. I had moved the DB, but i decided to give your option a try. It has worked and the db is restored. pg_resetxlog has done the trick.Koziara
This problem can also happen when an Windows OS Upgrade occoures - Not only does the postmaster become inaccessable, but permissions on the data folder and the service may dissapear. pg_resetxlog solves the first problem.Yorker
This can also simply happen with an incredibly overloaded storage subsystem on linux.Minnieminnnie
Had this after a (very mean) test of overloading the disk subsystem on linux.Minnieminnnie
In newer versions pg_resetxlog was renamed to pg_resetwalPsych
J
6

Reading a few similar messages in the archives of the PostgreSQL mailing list ("storage sync failed on magnetic disk: No such file or directory") seems to indicate that there is a very serious hardware trouble, much worse than a simple power failure. You may have to prepare yourself to restore from backups.

Jaquelin answered 28/2, 2009 at 17:57 Comment(2)
Ant P,Vlad Romascanu and bortzmeyer - Thank you for all your commennts. I have figured out that the hard disk has got corrupted because of the power spike. I have to move postgres to another machine.Koziara
@bortzmeyer: fsck told me that there were errors in the hard disk. so i was misled into believing that the db cannot be restored but the pg_resetxlog command has done the trick. Thanks a lot.Koziara
S
3

Had db corruption too, my actions

docker run -it --rm -v /path/to/db:/var/lib/postgresql/data postgres:10.3 bash
su - postgres
/usr/lib/postgresql/10/bin/pg_resetwal -D /var/lib/postgresql/data -f
Skyjack answered 30/4, 2018 at 20:31 Comment(0)
S
2

I had this same problem and I was about to dump, reinstall and import from db dump (a really painfull process), however I just tried this as the last resource and it worked!

brew services start postgresql

Then I restarted and that was it.

Slavery answered 6/7, 2020 at 16:21 Comment(0)
I
1

Run start instead of restart. Execute the below command:

$pg_ctl -D /usr/local/pgsql/data start
Interpretive answered 7/8, 2009 at 5:6 Comment(1)
I get "pg_ctl: directory "/usr/local/pgsql/data" does not exist" error when I run thisBinary
F
1

Had this problem a couple of times, when my laptop turned off unexpectedly, when on very low battery while running PSQL in the background.

My solution after searching all over was, Hard delete and Reinstall, then import data from db dump.

Steps for Mac with brew to uninstall and reinstall psql 9.6

brew uninstall [email protected]
rm -rf rm -rf /usr/local/var/[email protected]
rm -rf .psql.local .psql_history .psqlrc.local l.psqlrc .pgpass

brew install [email protected]

echo 'export PATH="/usr/local/opt/[email protected]/bin:$PATH"' >> ~/.bash_profile
source ~/.bash_profile

brew services start [email protected]

createuser -s postgres
createuser {ENTER_YOUR_USER_HERE} --interactive

Fragrant answered 26/6, 2020 at 12:49 Comment(0)
E
0

As others stated, a stop + start instead of a restart worked for me. In a Docker environment this would be:

docker stop <container_name>
docker start <container_name>

or when using Docker Compose:

docker-compose stop
docker-compose start
Eminence answered 4/8, 2020 at 1:23 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.