How to clean up Docker ZFS legacy shares

Summary

Given that:

  • The storage driver Docker uses is ZFS;
  • Only docker creates legacy datasets;

Bash:

$ docker ps -a | wc -l
16

$ docker volume ls | wc -l
12

$ zfs list | grep legacy | wc -l
157

16 containers (both running and stopped). 12 volumes. 157 datasets. This seems like an awful lot of legacy datasets. I'm wondering if a lot of them are so orphaned that not even docker knows about them anymore, so they don't get cleaned up.

Rationale

There is a huge list of legacy volumes in my Debian zfs pool. They started appearing when I started using Docker on this machine:

$ sudo zfs list | grep legacy | wc -l
486

They are all in the form of:

pool/var/<64-char-hash>                  202K  6,18T   818M  legacy

This location is used solely by docker.

$ docker info | grep -e Storage -e Dataset
Storage Driver: zfs
 Parent Dataset: pool/var

I started cleaning up.

$ docker system prune -a
  (...)
$ sudo zfs list | grep legacy | wc -l
154

That's better. However, I'm only running about 15 containers, and after running docker system prune -a, the history of every container shows that only the last image layer is still available. The rest are <missing> (because they have been cleaned up).

$ docker images | wc -l
15

If all containers use only the last image layer after pruning the rest, shouldn't docker only use 15 image layers and 15 running containers, totalling 30 volumes?

$ sudo zfs list | grep legacy | wc -l
154

Can I find out if they are in use by a container/image? Is there a command that traverses all pool/var/<hash> datasets in ZFS and figures out to what docker container/image they belong? Either a lot of them can be removed, or I don't understand how to figure out (beyond just trusting docker system prune) that they cannot.
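For what it's worth, a partial mapping seems possible for the container side: with the zfs storage driver, docker inspect appears to expose the dataset backing each container's writable layer under GraphDriver.Data. A minimal sketch, assuming that key layout (it will not show the intermediate image-layer datasets, which is why it cannot account for all 150+ entries):

$ docker ps -qa | xargs -r docker inspect --format '{{.Name}} {{.GraphDriver.Data.Dataset}}'

docker image inspect reports the same GraphDriver data for an image's top layer; the layers below it would still have to be traced through their ZFS origin snapshots.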

The excessive use of zfs volumes by docker messes up my zfs list command, both visually and performance-wise. Listing zfs volumes now takes ~10 seconds instead of <1.

Proof that docker reports no more dangling objects

$ docker ps -qa --no-trunc --filter "status=exited"
  (no output)
$ docker images --filter "dangling=true" -q --no-trunc
  (no output)
$ docker volume ls -qf dangling=true
  (no output)

zfs list example:

NAME                                                                                       USED  AVAIL  REFER  MOUNTPOINT
pool                                                                                      11,8T  5,81T   128K  /pool
pool/var                                                                                   154G  5,81T   147G  /mnt/var
pool/var/0028ab70abecb2e052d1b7ffc4fdccb74546350d33857894e22dcde2ed592c1c                 1,43M  5,81T  1,42M  legacy
pool/var/0028ab70abecb2e052d1b7ffc4fdccb74546350d33857894e22dcde2ed592c1c@211422332       10,7K      -  1,42M  -
# and 150 more of the last two with different hashes
Lance answered 18/9, 2018 at 13:3 Comment(7)
Have you tried the instructions suggested here? #35656349 – Sirocco
I did now, thanks for the suggestion. Unfortunately it doesn't work for finding what mounts are used for images or layers. It only finds containers with certain volumes, e.g. the ones in docker volume ls - which are only about 15 volumes (as expected). – Lance
I did read your question over 10 times. Then I realized maybe you didn't mean volumes in Docker at all. Since volumes in Docker cannot appear out of thin air, we must specify them with the '-v' flag. Could you please include a piece of the output of 'sudo zfs list'? Maybe I should edit my answer below after that... – Calciferol
@Calciferol They are most probably also Docker layers, because of the zfs storage driver. But I'm suspecting they are orphaned from an ancient version. See the example output added at the end of the question. – Lance
@Lance I don't have experience with Docker on zfs. But I am wondering: are the mountpoints already legacy when you run a container, or did they somehow get changed later? – Calciferol
@Redsandro: Can confirm the behavior with the latest Docker; did the prune dance, looks like (yet another) bug in Docker :/ This worked (but kills all volumes/images/etc.) for Docker: zfs list -r rpool/docker | awk '/docker\// { print $1 }' | xargs -l zfs destroy -R (replace rpool/docker with your local Docker dataset). – Diagnostics
Things become additionally difficult when running zfs-auto-snapshot, as all the ZFS datasets created by Docker get snapshotted regularly on top of that. Not sure yet if I like that. :-) – Hanni

I had the same question but couldn't find a satisfactory answer. Adding what I eventually found, since this question is one of the top search results.

Background

The ZFS storage driver for Docker stores each layer of each image as a separate legacy dataset.

Even just a handful of images can result in a huge number of layers, each layer corresponding to a legacy ZFS dataset.

  • Quote from the Docker ZFS driver docs:

    The base layer of an image is a ZFS filesystem. Each child layer is a ZFS clone based on a ZFS snapshot of the layer below it. A container is a ZFS clone based on a ZFS Snapshot of the top layer of the image it’s created from.
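You can see that parent/child structure directly in ZFS: every cloned layer dataset records the snapshot it was created from in its origin property. A small sketch (pool/var/<hash> is a placeholder for one of the datasets from the question):

 $ zfs get -H -o value origin pool/var/<hash>

The value comes back as pool/var/<parent-hash>@<snapshot>, i.e. the snapshot of the layer directly below it, so repeatedly following origin should walk the whole clone chain of an image.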

Investigate

You can check the datasets used by one image by running:

 $ docker image inspect [IMAGE_NAME]

Example output:

...
"RootFS": {
    "Type": "layers",
    "Layers": [
        "sha256:f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da",
        ...
        ...
        ...
        "sha256:2e8cc9f5313f9555a4decca744655ed461e21fbe48a0f078ed5f7c4e5292ad2e",
    ]
},
...

This explains why you can see 150+ datasets created when only running a dozen containers.
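To get a rough total of how many distinct layers all your local images reference, you can feed every image ID to docker image inspect and iterate over .RootFS.Layers with a Go template. A sketch, assuming GNU xargs and the standard docker CLI:

 $ docker image ls -q | xargs -r docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' | sort -u | wc -l

If that number is in the same ballpark as the legacy dataset count, the datasets are most likely live image layers rather than orphans.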

Solution

  1. Prune and delete unused images.

    $ docker image prune -a
    
  2. To avoid a slow zfs list, specify the dataset of interest.
    Suppose you store docker in tank/docker and other files in tank/data. List only the data datasets with the recursive option (a sketch for pinning Docker to its own parent dataset follows below):

    # recursively list tank/data/*
    $ zfs list tank/data -r
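
If you'd rather keep Docker's layer datasets out of the main tree entirely, the zfs storage driver accepts a zfs.fsname storage option that pins the parent dataset Docker uses. A sketch of /etc/docker/daemon.json, assuming tank/docker already exists (the daemon needs a restart afterwards):

 $ cat /etc/docker/daemon.json
 {
   "storage-driver": "zfs",
   "storage-opts": ["zfs.fsname=tank/docker"]
 }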
    
Jolenejolenta answered 18/3, 2020 at 1:30 Comment(1)
I created a small python tool that lists zfs datasets as a tree, with used size, and snapshots laid out in boxes. It helped me clean up that kind of mess. At least I could visualise it. github.com/vizyon-sa/zfs-tree – Elwaine

I use docker-in-docker containers, which also generate a lot of unused snapshots.

Based on @Redsandro's comment, I've used the following commands:

# count the snapshots and check pool usage first
sudo zfs list -t snapshot -r pool1 | wc -l
sudo zpool list

# destroy every unmounted docker dataset, together with its snapshots and clones (-R)
(sudo zfs get mounted | grep "mounted   no" | awk '/docker\// { print $1 }' | xargs -l sudo zfs destroy -R) 2> /dev/null

because just deleting all the snapshots ruined the consistency of Docker. Since Docker mounts every image it uses under /var/lib/docker/zfs/graph (the same goes for the docker-in-docker images), skipping the mounted datasets and destroying only the unmounted ones should remove just the dangling images/volumes/containers that were never properly freed up. You may need to run this repeatedly until the number of snapshots stops decreasing.
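Since zfs destroy -R is destructive, it may be worth doing a dry run first. zfs destroy supports -n (no-op) and -v (verbose), so the same pipeline can be made to only print what it would remove (a sketch based on the command above; the error redirect is dropped so you can see what happens):

sudo zfs get mounted | grep "mounted   no" | awk '/docker\// { print $1 }' | xargs -l sudo zfs destroy -R -n -v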

Phyllous answered 10/2, 2023 at 16:41 Comment(1)
So... this removes zfs snapshots, but does not remove layers in the docker registry, and now I have to fix our docker host. Well, at least I freed up 500GB of disk space. – Malo

Prune instructions on docker.com.

I assume your Docker version is lower than v17.06. Since you've executed docker system prune -a, the old layers' build information and volumes are gone. The -a/--all flag means that all images not referenced by at least one container are deleted; without it, only dangling images are deleted.

In addition, I think you have a misunderstanding about the <missing> mark and dangling images. <missing> doesn't mean that the layers marked as missing are really missing; it just means that those layers may have been built on other machines. Dangling images are unreferenced images. Even if the name and tag are marked <none>, the image could still be referenced by other images, which you can check with docker history image_id.

In your case, these layers are marked as missing because you have deleted the old versions of the images, which included that build information. As you said above, only the latest-version images are available, so only the latest layer is not marked missing.

Note this: docker system prune is a lazy way to manage all objects (image/container/volume/network/cache) of Docker.
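For reference, each object type also has its own prune subcommand, which gives finer control than the all-in-one docker system prune (standard Docker CLI commands; the build-cache one only exists in newer releases):

$ docker container prune    # stopped containers
$ docker image prune -a     # images not used by any container
$ docker volume prune       # unused local volumes
$ docker network prune      # unused networks
$ docker builder prune      # build cache (Docker 18.09+)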

Calciferol answered 22/9, 2018 at 9:46 Comment(2)
Thank you for thinking with me. However, system prune implies volume prune so I already did that. Just to be sure, I did a volume prune. It returns "Total reclaimed space: 0B". – Lance
@Lance Aha yes, so far it seems there is no undo operation for system prune. Next time, try volume prune instead. – Calciferol
