How to port data-only volumes from one host to another?

As described in the Docker documentation on Working with Volumes, there is the concept of so-called data-only containers, which provide a volume that can be mounted into multiple other containers, whether or not the data-only container is actually running.

Basically, this sounds awesome. But there is one thing I do not understand.

These volumes (which deliberately do not map to a folder on the host, for portability reasons, as the documentation states) are created and managed by Docker in an internal folder on the host (/var/lib/docker/volumes/…).
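
For reference, the pattern looks roughly like this (container and volume names are illustrative):

# create the data-only container; it never needs to be running
docker create -v /data --name DATA busybox true
# mount its volume into any other container
docker run --rm --volumes-from DATA busybox ls /data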

Suppose I use such a volume, and then I need to migrate it from one host to another - how do I port the volume? AFAICS it has a unique ID - can I just go and copy the volume and its corresponding data-only container to a new host? How do I find out which files to copy? Or is there some support built into Docker that I have not discovered yet?

Finsen answered 6/2, 2014 at 8:20 Comment(2)
You can export the data container directory: docker run --volumes-from <data container> ubuntu tar -cO <volume path> | gzip -c > volume.tgz. This does not rely on implementation details of the volumes. Then import the data with tar on the second machine.Ga
Wow, that's awesome, thanks :-)))! If you write this comment as an answer, I will accept it gladly!Finsen

The official answer is available in the section "Back up, restore, or migrate data volumes":

BACKUP:

sudo docker run --rm --volumes-from DATA -v $(pwd):/backup busybox tar cvf /backup/backup.tar /data
  • --rm: remove the container when it exits
  • --volumes-from DATA: attach to the volumes shared by the DATA container
  • -v $(pwd):/backup: bind mount the current directory into the container, as the place to write the tar file
  • busybox: a small, simple image - good for quick maintenance
  • tar cvf /backup/backup.tar /data: creates an uncompressed tar file of all the files in the /data directory
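
To actually port the data to another host, the tar file is what you move; any file-transfer tool will do (scp shown here as one option; user and host are illustrative):

scp ./backup.tar user@new-host: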

RESTORE:

# create a new data container
$ sudo docker create -v /data --name DATA2 busybox true
# untar the backup files into the new container's data volume
$ sudo docker run --rm --volumes-from DATA2 -v $(pwd):/backup busybox tar xvf /backup/backup.tar
data/
data/sven.txt
# compare to the original container
$ sudo docker run --rm --volumes-from DATA -v `pwd`:/backup busybox ls /data
sven.txt
Prestigious answered 21/5, 2014 at 9:1 Comment(14)
For now it is better to use docker create for data-only containers so they will not be started. See the example in the official documentation: docs.docker.com/userguide/dockervolumes/…Malorie
So... If I'm trying to back up a Postgres database, I would replace /data with /var/lib/postgresql/data, correct?Monique
It depends on the distribution you are using; usually it is in /var/lib/postgresql/[version]/data/Prestigious
I just want to transfer a volume from one machine to another. I don't have any running container. I tried to run the above commands but they need a container nameLocust
@Locust How did you create the volume?Prestigious
@Prestigious using command docker create volume <volume-name>. I posted one question also #42973847Locust
The "Backup, restore, or migrate data volumes" section seems to have been removed from the Docker documentation :-(Voyage
@Voyage sadly true, luckily this solution still holds though!Prestigious
@Voyage true, in the updated docs (docs.docker.com/engine/admin/volumes/…) I've found this snippet: "When you need to be able to back up, restore, or migrate data from one Docker host to another, volumes are a better choice. You can stop containers using the volume, then back up the volume’s directory (such as /var/lib/docker/volumes/<volume-name>)." Wonder if this is really portable/advisable as the same docs state that "...which is managed by Docker (/var/lib/docker/volumes/) (...). Non-Docker processes should not modify this part of the filesystem"Callboy
What is the final true for in docker create -v /data --name DATA2 busybox true?Karlykarlyn
@Karlykarlyn it's just a command used to create the data container; it could be any command that actually does nothing. The container starts and immediately exits, but it is used to persist data.Prestigious
Why is that way preferable as opposed to storing data in a certain host directory via a bind mount and then just packing and copying that folder to another host?Nonprofit
@Callboy If for some reason, you need the container to remain running (say, you want to docker exec into it), then a simple command is tail -f /dev/null which will never exit, but uses minimal resources. When you don't need it running anymore, docker stop data-container will do that for you. The volumes remain for other containers.Amulet
getting an error: bzip2: short write; tar: write error: Broken pipeMiscarriage

Extending the official answer from the Docker docs and the top answer here, you can have the following functions in your .bashrc or .zshrc:

# backup files from a docker volume into /tmp/backup.tar.gz
function docker-volume-backup-compressed() {
  docker run --rm -v /tmp:/backup --volumes-from "$1" debian:jessie tar -czvf /backup/backup.tar.gz "${@:2}"
}

# restore files from /tmp/backup.tar.gz into a docker volume
function docker-volume-restore-compressed() {
  docker run --rm -v /tmp:/backup --volumes-from "$1" debian:jessie tar -xzvf /backup/backup.tar.gz "${@:2}"
  echo "Double checking files..."
  docker run --rm -v /tmp:/backup --volumes-from "$1" debian:jessie ls -lh "${@:2}"
}

# backup files from a docker volume into /tmp/backup.tar
function docker-volume-backup() {
  docker run --rm -v /tmp:/backup --volumes-from "$1" busybox tar -cvf /backup/backup.tar "${@:2}"
}

# restore files from /tmp/backup.tar into a docker volume
function docker-volume-restore() {
  docker run --rm -v /tmp:/backup --volumes-from "$1" busybox tar -xvf /backup/backup.tar "${@:2}"
  echo "Double checking files..."
  docker run --rm -v /tmp:/backup --volumes-from "$1" busybox ls -lh "${@:2}"
}

Note that the backup is saved into /tmp, so you can move the backup file saved there between docker hosts.

There are also two pairs of backup/restore functions: one using compression and debian:jessie, and the other with no compression but with busybox. Favor compression if the files to back up are big.
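
For example, to move a /data volume from a data container named DATA to another host (a sketch; names and paths are illustrative, and it assumes SSH access and these functions installed on both hosts):

# on the source host
docker-volume-backup-compressed DATA /data
scp /tmp/backup.tar.gz user@other-host:/tmp/
# on the target host, with a data container DATA2 already created
docker-volume-restore-compressed DATA2 /data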

Aleedis answered 13/1, 2016 at 21:6 Comment(0)

You can export the volume to tar and transfer it to another machine, then import the data with tar on the second machine. This does not rely on implementation details of the volumes.

# you can list shared directories of the data container
docker inspect <data container> | grep "/vfs/dir/"

# you can export data container directory to tgz
docker run --cidfile=id.tmp --volumes-from <data container> ubuntu tar -cO <volume path> | gzip -c > volume.tgz

# clean up: remove exited container used for export and temporary file
docker rm `cat id.tmp` && rm -f id.tmp
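
On the second machine, the import is the reverse (a sketch; the data container name and volume path are placeholders, matching those above):

# create a data container on the target host
docker run -v <volume path> --name <data container> busybox true
# unpack the archive into its volume
gunzip -c volume.tgz | docker run --rm -i --volumes-from <data container> ubuntu tar -xf - -C /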
Ga answered 6/2, 2014 at 21:32 Comment(7)
thanks for your answer. How can I move the data container from one host to another?Themselves
@nXqd The data container is created by docker run -v /data-volume --name datacontainer busybox true - you can run this anywhere. After you create the data container, you can import the tar archive as explained in the answer.Ga
Thanks for your answer. But I ran into another problem: we need to remove the zombie container used for the backup afterwards, since this doesn't return an id. Do you have any good way? :DThemselves
@nXqd Sure - you have to use --cidfile=id.txt as a run parameter. The container ID will be stored in the file id.txt. I have updated the answer.Ga
You could just use docker run --rm instead of docker run --cidfile ... ; docker rm.Garvin
@FelixRabe Yes, this answer is not updated; it is better to use the solution from the official docker documentation - and that solution does use --rm (see the accepted answer).Ga
You could also create your data container with a name docker run --name data-container ... and later remove it by name docker container rm -f data-containerAmulet

I just wrote a docker-volume-snapshot command for a similar use case. The command is based on tommasop's answer.

With the command:

  1. Create a snapshot:
docker-volume-snapshot create <volume-name> snapshot.tar
  2. Move snapshot.tar to another host
  3. Restore the snapshot:
docker-volume-snapshot restore snapshot.tar <volume-name>
Lazaro answered 2/7, 2022 at 14:17 Comment(0)

I'll add another recent tool here, from IBM, which is made specifically for migrating volumes from one container host to another. This is an ongoing project, so you may find a different version with additional features in the future.

Cargo was developed to migrate containers from one host to another, along with their data, with minimal downtime. Cargo uses the data federation capabilities of a union filesystem to create a unified view of data (mainly the root file system) across the source and target hosts. This allows Cargo to start a container almost immediately (within milliseconds) on the target host, as the data from the source root file system gets copied to the target host either on demand (using a copy-on-write (COW) partition) or lazily in the background (using rsync).

An important point:

  • a centralized server handles the migration process

The link to the project is given here:

https://github.com/nadgowdas/cargo

Hoskins answered 1/12, 2017 at 14:36 Comment(0)

Here's a one-liner, in case an SSH connection can be established between the machines:

docker run --rm -v <SOURCE_DATA_VOLUME_NAME>:/from alpine ash -c "cd /from ; tar -cf - . " | ssh <TARGET_HOST> 'docker run --rm -i -v <TARGET_DATA_VOLUME_NAME>:/to alpine ash -c "cd /to ; tar -xpvf - " '
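
The same pipeline, split for readability; any extra ssh options, such as a key file or user (see the comments below), go in the ssh call (placeholders are illustrative):

docker run --rm -v <SOURCE_DATA_VOLUME_NAME>:/from alpine ash -c "cd /from ; tar -cf - ." \
  | ssh -i <KEY_FILE> <USER>@<TARGET_HOST> \
    'docker run --rm -i -v <TARGET_DATA_VOLUME_NAME>:/to alpine ash -c "cd /to ; tar -xpvf -"'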

Credits go to Guido Diepen's post.

Neddie answered 18/4, 2021 at 0:21 Comment(3)
Doesn't work for me, from Mac to AWS; it makes the copy but didn't send itAtwater
Did you check if the SSH connection can be established correctly?Neddie
I tried something like ssh -i ~/python/second-for-floo.pem [email protected] instead of ssh <TARGET_HOST>Atwater

In case your machines are in different VPCs, or you want to copy from/to a local machine (as in my case), you can use dvsync, which I created. It's basically ngrok combined with rsync over SSH, packaged into two small (~25MB each) images. First, you start the dvsync-server on the machine you want to copy data from (you'll need the NGROK_AUTHTOKEN, which can be obtained from the ngrok dashboard):

$ docker run --rm -e NGROK_AUTHTOKEN="$NGROK_AUTHTOKEN" \
  --mount source=MY_VOLUME,target=/data,readonly \
  quay.io/suda/dvsync-server

Then you can start the dvsync-client on the machine you want to copy the files to, passing the DVSYNC_TOKEN shown by the server:

docker run -e DVSYNC_TOKEN="$DVSYNC_TOKEN" \
  --mount source=MY_TARGET_VOLUME,target=/data \
  quay.io/suda/dvsync-client 

Once the copying is done, the client will exit. This works with the Docker CLI, Compose, Swarm, and Kubernetes as well.

Biota answered 11/7, 2018 at 21:32 Comment(0)

Adding an answer here as I don't have the reputation to comment. While all the above answers have helped me, I imagine there may be others like me who are looking to copy the contents of a backup.tar file into a named docker volume on a collaborator's machine. I don't see this discussed specifically above or in the docker volumes documentation.

Why would you want to copy the backup.tar file into a named docker volume?

This could be helpful in a scenario where a named docker volume has been specified inside an existing docker-compose.yml file to be used by some of the containers.

Copying contents of backup.tar into a named docker volume

  1. On the host machine, follow the steps in the accepted answer or the docker volumes documentation to create a backup.tar file and push it to some repository.

  2. Pull backup.tar onto the collaborator's machine from the repository.

  3. On the collaborator's machine, create a temporary container and a named docker volume:

docker run -v named_docker_volume:/dbdata --name temp_db_container ubuntu /bin/bash

  • --name temp_db_container : Create a container called temp_db_container

  • ubuntu /bin/bash : Use an ubuntu image to create temp_db_container, with a starting command of /bin/bash

  • -v named_docker_volume:/dbdata : Mount the docker volume named_docker_volume at the /dbdata folder of temp_db_container. We use this specifically named volume, named_docker_volume, to match the volume name specified in our docker-compose.yml file.

  4. On the collaborator's machine, copy the contents of backup.tar into the named docker volume:

docker run --rm --volumes-from temp_db_container -v $(pwd):/backup ubuntu bash -c "cd /dbdata && tar xvf /backup/backup.tar --strip 1"

  • --volumes-from temp_db_container : temp_db_container's /dbdata folder was mapped to the named_docker_volume volume in the previous step, so any file stored in the /dbdata folder immediately gets written to the named_docker_volume docker volume.
  • -v $(pwd):/backup : map the local machine's present working directory to the /backup folder inside the running container
  • ubuntu bash -c "cd /dbdata && tar xvf /backup/backup.tar --strip 1" : Untar the backup.tar file and store the untarred contents inside the /dbdata folder.
  5. On the collaborator's machine, remove the temporary container temp_db_container:

docker rm temp_db_container

Langmuir answered 6/10, 2021 at 5:8 Comment(0)

Adapted from the accepted answer, but with more flexibility in that you can use it in a bash pipeline:

#!/bin/bash

if [ $# != 2 ]; then
    echo Usage "$0": volume /path/of/the/dir/in/volume/to/backup
    exit 1
fi

if [ -t 1 ]; then
    echo The output of the cmd is binary data "(tar)", \
         and it should be redirected instead of printed to terminal
    exit 1
fi

volume="$1"
path="$2"

exec docker run --rm --mount type=volume,src="$volume",dst=/mnt/volume/ alpine tar cf - -C /mnt/volume/"$path" .
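
For example, assuming the script is saved as docker-volume-tar.sh (the name, volume, and host are illustrative), the stream can be compressed locally or piped straight into a volume on another host:

# stream the whole volume into a compressed archive
./docker-volume-tar.sh my_volume . | gzip > backup.tar.gz
# or pipe it directly into a volume on another host
./docker-volume-tar.sh my_volume . | ssh user@other-host \
    'docker run --rm -i -v my_volume:/to alpine tar xf - -C /to'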

If you want to back up the volume periodically and incrementally, then you can use the following script:

#!/bin/bash

if [ $# != 3 ]; then
    echo Usage "$0": volume /path/of/the/dir/in/volume/to/backup /path/to/put/backup
    exit 1
fi

volume="$1"
volume_path="$2"
path="$3"

if [[ "$path" =~ ^.*/$ ]]; then
    echo "The 3rd argument shouldn't end in '/', otherwise rsync would not behave as expected"
    exit 1
fi

container_name="docker-backup-rsync-service-$RANDOM"
docker run --rm --name="$container_name" -d -p 8738:873 \
    --mount type=volume,src="$volume",dst=/mnt/volume/ \
    nobodyxu/rsyncd

echo -e '\nStarting syncing...'

rsync --info=progress2,stats,symsafe -aHAX --delete \
    "rsync://localhost:8738/root/mnt/volume/$volume_path/"  "$path"
exit_status=$?

echo -e '\nStopping the rsyncd docker...'
docker stop -t 1 "$container_name"

exit $exit_status

It utilizes rsync's server and client functionality to sync the directory directly between the volume and your host directory.
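
Usage is then (a sketch, assuming the script is saved as docker-volume-rsync.sh; names and paths are illustrative, and the third argument must not end in a slash, which the script enforces):

./docker-volume-rsync.sh my_volume some/subdir /backups/my_volume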

Enidenigma answered 28/4, 2021 at 11:22 Comment(0)

I was dissatisfied with the answers using tar. I decided to take matters into my own hands. Since I am going to be syncing the data often, and it's going to be big, I specifically wanted to use rsync. Using tar to send all the data every time would be a waste of time and bandwidth.

After days spent working out how to solve the problem of communicating between two remote docker containers, I finally got a solution using socat.

  • Run two docker containers, one on the source host and the other on the destination host, each with one volume mounted: the source volume and the destination volume, respectively.
  • Run rsync --daemon in one of the containers, which will stream/load data from its volume.
  • Run docker exec destination_container socat - TCP:localhost:rsync and docker exec source_container socat TCP-LISTEN:rsync -, and connect the stdin and stdout of both together. So one socat connects to the rsync --daemon and redirects data from/to stdout/stdin; the other socat listens on the rsync port (873) and redirects to/from stdin/stdout. Connected together, they effectively pipe data from one container's port to the other's.
  • Then run, in the container on the other volume, an rsync client that connects to localhost:rsync, effectively connecting through the "socat pipe" to the rsync --daemon.

Basically, it works like this:

log "Running both destination and source containers"
src_did=$(
    env DOCKER_HOST=$src_docker_host docker run --rm -d -i -v \
    "$src_volume":/data:ro -w /data alpine_with_rsync_and_socat\
    sleep infinity
)
dst_did=$(
    env DOCKER_HOST=$dst_docker_host docker run --rm -d -i -v \
    "$dst_volume":/data:rw -w /data alpine_with_rsync_and_socat \
    sleep infinity
)

log "Running rsyncd on destination container"
    env DOCKER_HOST=$dst_docker_host docker exec "$dst_did" sh -c "
        cat <<EOF > /etc/rsyncd.conf &&
uid = root
gid = root
use chroot = no
max connections = 1
numeric ids = yes
reverse lookup = no
[data]
path = /data/
read only = no
EOF
        rsync --daemon
    "

log "Setup rsync socat forwarding between containers"
{
    coproc { env DOCKER_HOST=$dst_docker_host docker exec -i "$dst_did" \
       socat -T 10 - TCP:localhost:rsync,forever; }
    env DOCKER_HOST=$src_docker_host docker exec -i "$src_did" \
       socat -T 10 TCP-LISTEN:rsync,forever,reuseaddr - <&"${COPROC[0]}" >&"${COPROC[1]}"
} &

log "Running rsync on source that will connect to destination"
env DOCKER_HOST=$src_docker_host docker exec -e RSYNC_PASSWORD="$g_password" -w /data "$src_did" \
    rsync -aivxsAHSX --progress /data/ rsync://root@localhost/data

Another really nice thing about this approach is that you can copy data between two remote hosts without ever storing the data locally. I also share the script ,docker-rsync-volumes that I've written around this idea. With that script, copying a volume between two remote hosts is as simple as ,docker-rsync-volumes --delete -f ssh://user@productionserver grafana_data -t ssh://user@backupserver grafana_data_backup.

Sickly answered 14/9, 2021 at 20:31 Comment(1)
Worth noting that gzip supports an --rsyncable flag. From man gzip: --rsyncable: when you synchronize a compressed file between two computers, this option allows rsync to transfer only the files that were changed in the archive instead of the entire archive.Kant

This copies your volume from one server to another over ssh:

docker run --rm -v $VOLUME:/$VOLUME alpine tar -czv --to-stdout -C /$VOLUME . | ssh $REMOTEHOST "docker run --rm -i -v $VOLUME:/$VOLUME alpine tar xzf - -C /$VOLUME"

If you want to copy more than one volume that matches a filter:

REMOTEHOST=<user>@<remote-host>

Volumes=($(docker volume ls --filter "name=mailcow*" --format="{{.Name}}"))

for VOLUME in ${Volumes[@]}; do
   docker run --rm -v $VOLUME:/$VOLUME alpine tar -czv --to-stdout -C /$VOLUME . | ssh $REMOTEHOST "docker run --rm -i -v $VOLUME:/$VOLUME alpine tar xzf - -C /$VOLUME"
done
Brauer answered 30/5, 2022 at 19:47 Comment(0)
