Incremental `docker image save <images> | xz -zc - > images.tar.xz`
We have a Docker Compose project including various services, some of which share common base images. After building all images, one of our build job's post-build steps is to run docker image save <images> | xz -zc - > images.tar.xz to create a single compressed archive of all the images – to be used in an offline-deployment fallback strategy (so we can transport these images via USB or CD media rather than a Docker registry).

The uncompressed docker image save <images> tar-stream is about 2 GB in size. After piping it through xz, the compressed images.tar.xz is only about 500 MB.

This build job runs very often, and most of the time only a few images will have changed. However, the aforementioned docker … | xz … pipeline always recreates images.tar.xz in its entirety, which takes up most of the time in the overall build job. I'd like to optimize that.

Is there a way to speed up incremental builds?

I thought about running docker image save <imageN> | xz -zc - > imageN.tar.xz for each image individually, so that I only have to re-save modified images, but this would roughly double the required storage, because docker image save includes the duplicate base images in each individual call.

I would very much like to keep a single docker image save <images> invocation, but only update or re-compress the actual changes relative to a previous images.tar.xz. I know that, because of how tar.xz is structured, small changes – especially near the beginning of the stream – will nonetheless require recreating the whole file. However, I'd gladly accept another solution that involves splitting the tar stream reasonably, such that individual parts can be updated.

Note: Aside from some meta/manifest files at the end, the tar-stream contains a bunch of layer folders, each of which contains a layer.tar and some meta files, corresponding to the (de-duplicated) layers of all the saved images, e.g.:

0166389787802d9a6c19a832fcfe976c30144d2430e798785110d8e8e562dab6/
0166389787802d9a6c19a832fcfe976c30144d2430e798785110d8e8e562dab6/VERSION
0166389787802d9a6c19a832fcfe976c30144d2430e798785110d8e8e562dab6/json
0166389787802d9a6c19a832fcfe976c30144d2430e798785110d8e8e562dab6/layer.tar
...(~100x4)...
fa498ee40da8c70be99b8f451813d386b45da891353d7184cdb8dd1b40efca03/
fa498ee40da8c70be99b8f451813d386b45da891353d7184cdb8dd1b40efca03/VERSION
fa498ee40da8c70be99b8f451813d386b45da891353d7184cdb8dd1b40efca03/json
fa498ee40da8c70be99b8f451813d386b45da891353d7184cdb8dd1b40efca03/layer.tar
ffb2e673ba3e63b6b5922a482783b072759f0b83335a5ffab0b36dc804a24b93/
ffb2e673ba3e63b6b5922a482783b072759f0b83335a5ffab0b36dc804a24b93/VERSION
ffb2e673ba3e63b6b5922a482783b072759f0b83335a5ffab0b36dc804a24b93/json
ffb2e673ba3e63b6b5922a482783b072759f0b83335a5ffab0b36dc804a24b93/layer.tar
manifest.json
repositories

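The splitting idea could look roughly like this. This is a sketch only: everything below the stand-in step is what the real job would do, but the directory names, output paths, and the two fake layer IDs (aaa111, bbb222) are made up so the script is self-contained; in the real pipeline the fabricated stream would be replaced by docker image save <images> | tar -x -C "$WORK". It relies on layer directory names being content-addressed IDs, so an unchanged layer keeps its name across saves and its archive from the previous run can be reused untouched.

```shell
#!/bin/sh
set -eu

WORK=./image-export          # extracted save stream lands here
OUT=./layer-archives         # one .tar.xz per layer, plus a metadata archive
mkdir -p "$WORK" "$OUT"

# In the real build job this step would be:
#   docker image save <images> | tar -x -C "$WORK"
# To keep the sketch self-contained, fabricate a minimal stand-in stream:
fake=$(mktemp -d)
for id in aaa111 bbb222; do
    mkdir "$fake/$id"
    echo 1.0  > "$fake/$id/VERSION"
    echo '{}' > "$fake/$id/json"
    tar -cf "$fake/$id/layer.tar" -C "$fake/$id" VERSION
done
echo '[]' > "$fake/manifest.json"
echo '{}' > "$fake/repositories"
tar -c -C "$fake" . | tar -x -C "$WORK"

# Compress each layer directory individually. Layer IDs are content-addressed,
# so a layer that did not change keeps its directory name and we can skip it.
for dir in "$WORK"/*/; do
    id=$(basename "$dir")
    if [ ! -f "$OUT/$id.tar.xz" ]; then
        tar -c -C "$WORK" "$id" | xz -zc - > "$OUT/$id.tar.xz"
    fi
done

# The small metadata files change whenever tags move, so always refresh them.
tar -c -C "$WORK" manifest.json repositories | xz -zc - > "$OUT/meta.tar.xz"
```

On the target machine, the parts could then be reassembled by extracting every per-layer archive plus meta.tar.xz into one directory and feeding a fresh tar -c of that directory to docker image load.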
PS: I'm already using pxz instead of xz to utilize all CPU cores during compression, but it still takes a considerable amount of time.
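For reference, xz itself (version 5.2 and later) can multi-thread natively via -T0, which may make pxz unnecessary; threaded mode compresses independent blocks, which costs a little compression ratio but also allows parallel decompression later. A tiny self-contained demonstration (sample.bin is a made-up stand-in for the real save stream):

```shell
# xz >= 5.2 multi-threads natively: -T0 spawns one worker per CPU core.
head -c 1048576 /dev/urandom > sample.bin   # stand-in for the save stream
xz -T0 -zc sample.bin > sample.bin.xz       # compress with all cores
xz -dc sample.bin.xz | cmp - sample.bin     # verify lossless round-trip
```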

Schizoid asked 7/4, 2017 at 9:40 · Comments (5):
Why don't you use a Docker registry (like the free Docker Registry or Portus)? A registry is a solution for storing images efficiently, and if it is located on the same network, data transfers are quick. – Jett
We do use a registry to deploy our images, so images can easily be updated from the cloud (once the Docker Compose project gets updated via some Debian package). However, some of our customers don't allow cloud communication and thus can't update online, so we have an additional, fairly "automated" offline deployment strategy that involves a USB stick and some minimal manual labor. This needs to be tested as well, so it is part of our continuous integration pipeline. – Schizoid
Maybe we should split this offline stuff into a separate build that only runs once a day and on final releases, or something. – Schizoid
This offline stuff could pull from your registry (from within your own network – no cloud) and then deliver the images to the customer on a USB stick. One issue is how you deliver your images; the other is how you store them. You can still use the registry for storage and change only the delivery, because a registry does storage really well. – Jett
My question is about neither the online delivery nor the storage of Docker images. Yes, we do have a registry for that job. My question is about optimizing incremental docker image save … | xz -zc - invocations. If that's hard to optimize, I will look into decreasing the number of times these invocations take place. – Schizoid
