Backup and restore docker named volume

2

10

I have a question regarding volumes and ownership.

As an example I'm using the privatebin image, but the question applies to any image.

First, I create a volume:

$ docker volume create privatebin-data

From docker inspect, I can see where the data is located:

$ docker inspect privatebin-data 
[
    {
        "CreatedAt": "2018-12-04T21:42:46+01:00",
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/privatebin-data/_data",
        "Name": "privatebin-data",
        "Options": {},
        "Scope": "local"
    }
]

Following the instructions on Docker Hub, I start a container from the image:

$ docker run -d --restart="always" --read-only -p 8080:80 -v privatebin-data:/srv/data privatebin/nginx-fpm-alpine:1.1.1

Then I visit http://localhost:8080 and everything is working as expected.

The volume now contains:

$ ls -l /var/lib/docker/volumes/privatebin-data/_data
total 16
drwx------ 3 82 82 4096 Dec  4 21:49 73
-rw-r----- 1 82 82   46 Dec  4 21:49 purge_limiter.php
-rw-r----- 1 82 82  529 Dec  4 21:49 salt.php
-rw-r----- 1 82 82  131 Dec  4 21:49 traffic_limiter.php

I want to back up the directory by archiving it:

tar -C /var/lib/docker/volumes/privatebin-data -czf privatebin-data-backup.tar.gz _data

My question is: can I safely assume that if I start the image again, for example on another server, the owning user and group will still be 82? And is this the proper way to back up and restore Docker volumes?

Caston answered 4/12, 2018 at 21:39 Comment(0)
15

The UID and GID come from inside your image, privatebin/nginx-fpm-alpine. As long as the image creates its users in the same way and order, and nothing changes in the base image, those IDs will be the same regardless of where you run the image.
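
One way to confirm this on each server, without depending on the /var/lib/docker path, is to list the volume through a throwaway container. This is only a minimal sketch: show_volume_owners is a hypothetical helper name, and it assumes the busybox image can be pulled on that host.

```shell
#!/bin/sh
# show_volume_owners is a hypothetical helper name, not a Docker command.
# It lists a named volume's top-level entries with numeric owner and group IDs.
show_volume_owners() {
  # -v "$1":/data:ro  mounts the named volume read-only in a throwaway container
  # ls -ln            prints UIDs/GIDs as numbers instead of resolving names
  docker run --rm -v "$1":/data:ro busybox ls -ln /data
}
```

Running show_volume_owners privatebin-data on the old and the new server and comparing the owner columns should show whether the IDs still match.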

My preferred way to back up and restore volumes is to use a utility container, in case Docker's storage backend changes or you decide to move your named volume to another location or an external data store. The commands to do that look like:

docker run --rm \
  -v privatebin-data:/source:ro \
  busybox tar -czC /source . >privatebin-data-backup.tar.gz

and

docker run --rm -i \
  -v privatebin-data:/target \
  busybox tar -xzC /target <privatebin-data-backup.tar.gz
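
For readers unfamiliar with tar, the backup command can be wrapped and annotated roughly as follows. This is a sketch, not part of the original answer: backup_volume and the .partial suffix are hypothetical names, and writing to a temporary file first is an added precaution so that a failed run does not leave a truncated archive behind.

```shell
#!/bin/sh
# backup_volume is a hypothetical helper name, not part of Docker.
# Usage: backup_volume VOLUME ARCHIVE
backup_volume() {
  bv_volume=$1
  bv_archive=$2
  bv_tmp="$bv_archive.partial"   # assumed temporary name

  # -v VOLUME:/source:ro  mounts the named volume read-only in the container
  # tar -c -z -C /source .  creates a gzip-compressed archive of /source;
  #                         no -f is given, so the archive goes to stdout
  # >"$bv_tmp"            the host shell captures that stdout in a file
  if docker run --rm \
      -v "$bv_volume":/source:ro \
      busybox tar -czC /source . >"$bv_tmp"; then
    # Only publish the archive once the container exited successfully.
    mv "$bv_tmp" "$bv_archive"
  else
    rm -f "$bv_tmp"
    return 1
  fi
}
```

For example, backup_volume privatebin-data privatebin-data-backup.tar.gz would produce the same archive as the backup command above. The restore command works the same way in reverse: -i keeps stdin open so the archive on the host can be piped into tar -x inside the container.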
Baloney answered 4/12, 2018 at 22:49 Comment(11)
For some reason I thought that 82:82 was somehow random, but after your reply I dug into it and found that it is not random at all, so that clarifies it. As for the second part, a quick follow-up question: is there any advantage to taking the data from inside a container over taking it directly from the host? What's the difference? - Caston
@RobertMaguda Using a utility container allows the named volume to change its source without changing the backup command. With the local driver, a named volume can be backed by more than just a directory under /var/lib/docker, and there are other drivers that let you use third-party storage systems. - Baloney
@Baloney Could you add some explanation of the backup and restore docker run commands, please? They seem to be the commands I am looking for, but I couldn't fully understand them as I lack busybox and tar knowledge. - Hematite
@Hematite busybox is just a minimal distribution that includes commands like tar. The man page for tar covers most of what you're looking for. Other than that, it's using stdin/stdout of the container to transfer the tar file between the host and the container that has the volume mounted. man7.org/linux/man-pages/man1/tar.1.html - Baloney
@Baloney Perfect, I had no idea what busybox and tar were. I think I have enough information to look further into this, thanks to you. Is there an industry standard for backing up Docker volumes, by any chance? The official Docker docs seem to suggest an approach similar to your answer above: docs.docker.com/storage/volumes/… - Hematite
@Baloney I believe a great benefit of using busybox would be its smaller footprint, perhaps. - Hematite
@Hematite Compared to ubuntu at almost 74MB, busybox is a little over 1MB, so I use busybox when I don't need the functionality of a full distribution (e.g. when I don't need to install additional packages with a package manager). This is as close to a standard as I'm aware of. Other solutions that back up the underlying filesystem directly will fail when you configure named volumes to store data in a different location. - Baloney
@Baloney Makes perfect sense. Thank you for sharing your knowledge. Stay safe :) - Hematite
If I understood correctly, the restore command will not replace the content of the volume with the content of the tar.gz, but will merge the two in the volume, right? I tried it with postgres-data, and it left the database in an inconsistent state. Is there a way to modify the tar command so that it replaces the content? Or do I need to delete the volume first, making absolutely sure that the tar.gz can be extracted afterwards? - Aerometer
@EricDuminil The restore command assumes an empty volume. Otherwise it only creates or replaces the files that were in the backup, and any other files remain. - Baloney
Thanks. I created a makefile task for backup and restore. I first check that the tar file is available, then remove the volume, create it again, and then use your command. Something like this: pastebin.com/T07902Rh WARNING: the existing volume will be deleted. This might be a feature or a bug. - Aerometer
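
The sequence described in the last comment (check the archive, remove the volume, create it again, then restore) could be sketched as a shell function like the one below. It is not the contents of the linked paste: restore_volume is a hypothetical helper name, and the sketch assumes the volume uses default options, since a plain docker volume create does not carry over any driver options or labels from the old volume.

```shell
#!/bin/sh
# restore_volume is a hypothetical helper name, not a Docker command.
# Usage: restore_volume VOLUME ARCHIVE
# WARNING: the existing volume and everything in it is deleted.
restore_volume() {
  rv_volume=$1
  rv_archive=$2

  # Refuse to touch the volume unless the archive exists and is non-empty.
  if [ ! -s "$rv_archive" ]; then
    echo "backup archive not found or empty: $rv_archive" >&2
    return 1
  fi

  # Remove the old volume if it exists. docker volume rm fails while a
  # container still uses the volume, which stops the restore early.
  if docker volume inspect "$rv_volume" >/dev/null 2>&1; then
    docker volume rm "$rv_volume" >/dev/null || return 1
  fi

  # Recreate the volume empty (default options assumed), then unpack into it.
  docker volume create "$rv_volume" >/dev/null || return 1
  docker run --rm -i \
    -v "$rv_volume":/target \
    busybox tar -xzC /target <"$rv_archive"
}
```

Stopping the containers that use the volume before calling restore_volume privatebin-data privatebin-data-backup.tar.gz avoids both the failed removal and restoring underneath a running service.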
1

You can use docker-volume-snapshot to back up a Docker named volume.

After installing the utility, just enter this command:

docker-volume-snapshot create privatebin-data privatebin-data-backup.tar
Brummett answered 29/10, 2022 at 17:48 Comment(0)
