I was dissatisfied with the answer using tar, so I decided to take matters into my own hands. As I am going to be syncing the data often, and it is going to be big, I specifically wanted to use rsync: using tar to send all the data every time would be a waste of both time and transfer.
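This is easy to see locally (a minimal sketch, with /src/ and /dst/ as placeholder paths): a repeated rsync run sends nothing when nothing has changed, whereas tar would re-pack and re-ship the whole tree every time.

rsync -ai /src/ /dst/   # first run: everything is copied
rsync -ai /src/ /dst/   # second run: prints nothing, nothing is re-sent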
After days spent working out how to make two remote docker containers talk to each other, I finally arrived at a solution using socat.
- Run two docker containers, one on the source host and the other on the destination host, each with one volume mounted: the source volume and the destination volume respectively.
- Run rsync --daemon on one of the containers (the destination, in the script below); it will serve and receive the data from its volume.
- Run docker exec destination_container socat - TCP:localhost:rsync and docker exec source_container socat TCP-LISTEN:rsync -, and connect the stdin and stdout of the two together. One socat connects to the rsync --daemon and relays data from/to its stdout/stdin; the other listens on the rsync port (873) and relays to/from its stdin/stdout. Joined together, they effectively pipe a port of one container to a port of the other (see the stand-alone sketch after this list).
- Then, on the container with the other volume, run an rsync client that connects to localhost:rsync, effectively reaching the rsync --daemon through the "socat pipe".
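To make the "socat pipe" trick concrete, here is a minimal stand-alone sketch, no docker involved; example.org and the ports are just placeholders. A bash coproc joins the stdin/stdout of two socat processes, so whatever arrives on the listening port is relayed to the remote service and back.

# Hypothetical local demo of the relay: traffic to localhost:8873 is
# carried over the joined stdio streams to example.org:80.
coproc { socat - TCP:example.org:80; }
# The listening socat reads from the coproc's stdout and writes to its stdin.
socat TCP-LISTEN:8873,reuseaddr - <&"${COPROC[0]}" >&"${COPROC[1]}"

Note that a single joined pair carries one connection at a time, which is enough for one rsync session and is why the daemon config below sets max connections = 1.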
Basically, it works like this:
log "Running both destination and source containers"
src_did=$(
env DOCKER_HOST=$src_docker_host docker run --rm -d -i -v \
"$src_volume":/data:ro -w /data alpine_with_rsync_and_socat\
sleep infinity
)
dst_did=$(
env DOCKER_HOST=$dst_docker_host docker run --rm -d -i -v \
"$dst_volume":/data:rw -w /data alpine_with_rsync_and_socat \
sleep infinity
)
log "Running rsyncd on destination container"
env DOCKER_HOST=$dst_docker_host docker exec "$dst_did" sh -c "
cat <<EOF > /etc/rsyncd.conf &&
uid = root
gid = root
use chroot = no
max connections = 1
numeric ids = yes
reverse lookup = no
[data]
path = /data/
read only = no
EOF
rsync --daemon
"
log "Setup rsync socat forwarding between containers"
{
coproc { env DOCKER_HOST=$dst_docker_host docker exec -i "$dst_did" \
socat -T 10 - TCP:localhost:rsync,forever; }
env DOCKER_HOST=$src_docker_host docker exec -i "$src_did" \
socat -T 10 TCP-LISTEN:rsync,forever,reuseaddr - <&"${COPROC[0]}" >&"${COPROC[1]}"
} &
log "Running rsync on source that will connect to destination"
env DOCKER_HOST=$src_docker_host docker exec -e RSYNC_PASSWORD="$g_password" -w /data "$src_did" \
rsync -aivxsAHSX --progress /data/ rsync://root@localhost/data
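Once the transfer finishes, cleanup is just stopping the helper containers. A sketch of an optional sanity check plus teardown, using the same variables as above (du is only a rough size comparison, not a byte-for-byte verification):

log "Spot-checking volume sizes on both sides"
env DOCKER_HOST=$src_docker_host docker exec "$src_did" du -s /data
env DOCKER_HOST=$dst_docker_host docker exec "$dst_did" du -s /data
log "Stopping helper containers (started with --rm, so they remove themselves)"
env DOCKER_HOST=$src_docker_host docker stop "$src_did"
env DOCKER_HOST=$dst_docker_host docker stop "$dst_did"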
Another really nice thing about this approach is that you can copy data between two remote hosts without ever storing the data locally. I also share the script, docker-rsync-volumes, that I've written around this idea. With that script, copying a volume between two remote hosts is as simple as docker-rsync-volumes --delete -f ssh://user@productionserver grafana_data -t ssh://user@backupserver grafana_data_backup.