How to back up a Django db

2

69

I have a Django application that uses a Postgres database. I need to be able to back up and restore the db, both to ensure no data is lost and to be able to copy data from the production server to the development server during testing.

There seem to be a few different ways to do this:

  1. Just interact with the db directly. So, for Postgres I might write a script using pg_dumpall and psql.

  2. Use the sqlclear/sqlall commands that come with Django.

  3. Use the dumpdata/loaddata commands that come with Django. So create new fixtures from the db you want to back up and then load them into the db you want to restore.

  4. Use a Django plugin like django-dbbackup.

I really don't understand the pros/cons of these different techniques.

Just off the top of my head: Option 1 is database-specific and option 3 seems more suited to setting up initial data. But I'm still not sure what advantages option 4 has over option 2.

Zusman answered 10/1, 2014 at 16:22 Comment(10)
Why don't you just create a copy of the entire database? postgresql.org/docs/8.1/static/backup.html#BACKUP-DUMP - Bligh
Does django-dbbackup even work? I clearly see code there that hasn't got a chance to work: bitbucket.org/mjs7231/django-dbbackup/src/… - Woodward
@Bligh - That would work, but the commands are specific to Postgres; if the underlying db changes, I would have to rewrite the script. - Zusman
@Woodward - I haven't fully tested it yet. The bit of code you were looking at saves to Amazon S3; I was just going to save to a local file. - Zusman
Fair enough, though I'd be cautious of code containing such obvious errors, especially for tasks as important as taking backups. - Woodward
You mention you ended up writing your own scripts. How do they compare to django-dbbackup? And care to share? - Seaton
@Seaton - The script that I wrote can be found here: pastebin.com/3afcrHqe. It assumes a standard Django "settings.py" with all the database info. - Zusman
@Zusman Fantastic! Seems rather sensible. The backup runs fine after filling in the couple of lines that need tailoring to my project, but what about restoring? - Seaton
@Seaton - The restore is very similar: pastebin.com/2hbkwsp0. - Zusman
@Zusman Looks good! Thanks again, upvotes all over :) I'll do some heavy testing on these (django-dbbackup was giving some UTF-8 related errors on restore). I was just starting on my own basic bash scripts for this, but your Python scripts look much better! - Seaton
40

For regular backups I'd go for option 1, using PostgreSQL's own native tool, as it is probably the most efficient.
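
As a rough illustration of option 1, here is a minimal sketch that builds the pg_dump and psql command lines from Python. The function names and the database/file names are made up for the example; the password is assumed to come from the environment (for instance PGPASSWORD or a ~/.pgpass file), not from the script.

```python
import subprocess


def pg_dump_command(dbname, outfile, user=None, host=None, port=None):
    """Build a pg_dump command that writes a plain-SQL dump to outfile."""
    cmd = ["pg_dump", "--format=plain", "--no-owner", f"--file={outfile}"]
    if user:
        cmd.append(f"--username={user}")
    if host:
        cmd.append(f"--host={host}")
    if port:
        cmd.append(f"--port={port}")
    cmd.append(dbname)  # the database name is the positional argument
    return cmd


def psql_restore_command(dbname, infile, user=None, host=None, port=None):
    """Build a psql command that replays a plain-SQL dump into dbname."""
    # ON_ERROR_STOP makes psql abort on the first failing statement
    cmd = ["psql", "--set=ON_ERROR_STOP=1", f"--file={infile}", f"--dbname={dbname}"]
    if user:
        cmd.append(f"--username={user}")
    if host:
        cmd.append(f"--host={host}")
    if port:
        cmd.append(f"--port={port}")
    return cmd


def run(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)
```

Something like run(pg_dump_command("mydb", "backup.sql", user="django")) would then take the backup, and the matching psql command would restore it into an empty database.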

I would argue that option 2 is primarily concerned with creating the tables and loading initial data so is not suitable for backups.

Option 3 can be used for backups and would be particularly useful if you needed to migrate to a different database platform since the data is dumped in a non-SQL form, i.e. JSON understood by Django.
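
For option 3, a sketch along these lines might work, assuming a reasonably recent Django where dumpdata accepts --natural-foreign, --natural-primary, --exclude, --indent and --output. The helper names and the default exclusions are my own choices for the example, not anything Django prescribes.

```python
import subprocess
import sys


def dumpdata_command(outfile, exclude=("contenttypes", "auth.permission"),
                     manage_py="manage.py"):
    """Build a 'manage.py dumpdata' command that writes a JSON fixture."""
    cmd = [
        sys.executable, manage_py, "dumpdata",
        # natural keys avoid hard-coding primary keys of related rows
        "--natural-foreign", "--natural-primary",
        "--indent", "2",
        "--output", outfile,
    ]
    # these tables are recreated by migrate, so they are skipped by default here
    for label in exclude:
        cmd += ["--exclude", label]
    return cmd


def loaddata_command(fixture, manage_py="manage.py"):
    """Build a 'manage.py loaddata' command that loads the fixture."""
    return [sys.executable, manage_py, "loaddata", fixture]


def run(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)
```

The target database needs its schema in place first (for example via migrate) before the loaddata command is run against it.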

Option 4, the plugin, appears to use the db's own backup tools (as per option 1) but additionally helps push your backups into cloud storage such as Amazon S3 or Dropbox.

Zedekiah answered 10/1, 2014 at 21:18 Comment(4)
I wound up writing my own Python scripts to back up/restore the database. They read from the Django settings module to figure out what type of db it is. Currently, it only supports Postgres, but there are hooks for other formats. - Zusman
@Zusman Can you upload the script to Git or somewhere and share it? - Jellicoe
@Jellicoe - I haven't used this code in years, but you can find links to my old scripts in the comments above: pastebin.com/3afcrHqe and pastebin.com/2hbkwsp0. (Note that these are 5+ years old; Django may have moved on in the meantime.) - Zusman
Oh sorry, I did not see the year. Thanks anyway. - Jellicoe
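
The settings-driven approach described in these comments could look roughly like the sketch below. This is not the pastebin script, just an illustration: BACKUP_HOOKS and backup_command are hypothetical names, and the dict passed in is assumed to have the same shape as Django's DATABASES setting.

```python
def _postgres_backup(db, outfile):
    """Build a pg_dump command from one entry of a DATABASES-style dict."""
    cmd = ["pg_dump", f"--file={outfile}"]
    if db.get("USER"):
        cmd.append(f"--username={db['USER']}")
    if db.get("HOST"):
        cmd.append(f"--host={db['HOST']}")
    if db.get("PORT"):
        cmd.append(f"--port={db['PORT']}")
    cmd.append(db["NAME"])
    return cmd


# Hypothetical registry: one hook per ENGINE value; add entries for other backends
BACKUP_HOOKS = {
    "django.db.backends.postgresql": _postgres_backup,
    "django.db.backends.postgresql_psycopg2": _postgres_backup,
}


def backup_command(databases, outfile, alias="default"):
    """Pick the backup hook matching the configured engine and build its command."""
    db = databases[alias]
    try:
        hook = BACKUP_HOOKS[db["ENGINE"]]
    except KeyError:
        raise NotImplementedError(f"no backup hook for engine {db['ENGINE']!r}") from None
    return hook(db, outfile)
```

In a real project the first argument would be settings.DATABASES, and the returned list could be handed to subprocess.run.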
41

The problem with options 1-3 is that media files (anything uploaded through FileField) are not included in the backup. It is possible to separately back up the directory containing the media files. However, because Django doesn't remove files when they are no longer referenced by a FileField, you will inevitably end up with files in the backup that don't need to be there.

That's why I would go with option #4. In particular, I recommend django-archive*. Some of its features include:

  • Dumps the contents of all important models (by default ContentType, Permission, and Session are excluded since they are populated by manage.py migrate) and lets you choose additional models to exclude.

  • Includes media files referenced by FileField and ImageField fields. Note that only the files referenced by rows in the database are included; files left over by deleted rows are ignored.

  • Produces a single archive containing both the database backup and media files.

  • Provides options for customizing the location where archives should be stored, the filename format, and archive type (gz and bz2).
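
To illustrate the "only referenced files" idea from the list above, here is a small standalone sketch. It is not how django-archive is implemented; split_media_files is a hypothetical helper, and the referenced names are assumed to be paths relative to the media root (which is what a FileField's .name holds with the default storage).

```python
from pathlib import Path


def split_media_files(media_root, referenced):
    """Return (kept, orphaned) lists of paths relative to media_root.

    kept     -> files on disk that some database row still references
    orphaned -> files on disk that nothing references any more
    """
    root = Path(media_root)
    on_disk = {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    }
    referenced = set(referenced)
    kept = sorted(on_disk & referenced)
    orphaned = sorted(on_disk - referenced)
    return kept, orphaned
```

Backing up only the kept list avoids carrying leftovers from deleted rows into every archive.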

Installation is as simple as adding django_archive to INSTALLED_APPS and setting options in settings.py if needed. Once installed, you can immediately create an archive of your entire database (including media files) by running:

./manage.py archive

* Disclaimer: I am the author of the package

Stairway answered 16/3, 2016 at 3:9 Comment(7)
That is a good point. For my particular project, I didn't have to worry about media files. But other users may find your package useful. - Zusman
Is there a way to call it programmatically from the Django application, rather than manage.py? - Bias
Also, I get a broken archive on Windows with Python 3.7. - Bias
Are references to content types restored after migrate? - Eppes
@SergeRogatch The issue causing that has been fixed in a subsequent release. - Stairway
@SergeRogatch To call it programmatically you could use call_command: from django.core.management import call_command - Deepsix
The package was not interesting. - Punctilious
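
The call_command suggestion above might be used like this. It is a minimal sketch, assuming django_archive is in INSTALLED_APPS and that Django is already configured (inside a view, task, or after django.setup()); create_archive is just a name chosen for the example.

```python
def create_archive():
    """Run the 'archive' management command from Python code."""
    # Imported lazily so this module can be imported before Django is configured
    from django.core.management import call_command

    # Equivalent to running "./manage.py archive" from the shell
    call_command("archive")
```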

© 2022 - 2024 — McMap. All rights reserved.