How does one get the size of a Docker image before they pull it to their machine?
When you look up an image on Docker Hub, its page has two tabs: Repo Info and Tags. Open the Tags tab and you will see the sizes of all the variants you can pull for that image.
- For images on Docker Hub:
curl -s -H "Authorization: JWT " "https://hub.docker.com/v2/repositories/library/<image-name>/tags/?page_size=100" | jq -r '.results[] | select(.name == "<tag-name>") | .images[0].size' | numfmt --to=iec-i
- For images on other registries, such as Microsoft Container Registry:
Push the image to Docker Hub and read the compressed size on the Docker Hub website.
Or use docker save to save the image to a .tar file and then compress it to a .tar.gz file:
docker save my-image:latest > my-image.tar
# Compress the .tar file
gzip my-image.tar
# Check the size of the compressed image
ls -lh my-image.tar.gz
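If you only want the number and not the archive, the same measurement can be streamed through gzip and counted, so nothing is written to disk. This is a minimal sketch assuming a POSIX shell with gzip and wc; compressed_size is a hypothetical helper name, not a Docker command. The image must already be present locally for docker save to work, and a registry stores each layer as its own compressed archive, so the figure only approximates what a pull would transfer.

```shell
# Hypothetical helper: count the gzip-compressed bytes of whatever arrives on stdin.
compressed_size() {
  # tr strips the padding some wc implementations add around the count
  gzip -c | wc -c | tr -d ' '
}

# Usage (requires the image to be present locally):
#   docker save my-image:latest | compressed_size
```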
- To view the manifest data manually:
Use docker manifest inspect to view the manifest data, which shows you the compressed size of the image.
You first need to enable it by editing the ~/.docker/config.json file and setting experimental to enabled. Example: { "experimental": "enabled" }. More info is in the official docs.
Run docker manifest inspect -v <registry-domain>/<image-name> and add up the size values of the layers, but only for your specific architecture (e.g. amd64).
docker manifest inspect -v <registry-domain>/<image-name> | grep size | awk -F ':' '{sum+=$NF} END {print sum}' | numfmt --to=iec-i
Notes:
- It's the compressed size of the layers, not their on-disk size on your server.
- If the image is a multi-arch image (e.g. the alpine image covers arm, amd64 and several other architectures), you'll get the total across all of them, whereas in actual use docker only pulls the layers for the relevant arch.
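Some awk implementations print a large total in scientific notation (for example 4.37477e+09), which numfmt cannot parse. One possible workaround, sketched here under the assumption of a POSIX awk, is to format the total explicitly with printf "%.0f". sum_sizes is a hypothetical helper name; like the one-liner above, it adds up every line that contains a size field in pretty-printed JSON, not only the layer entries.

```shell
# Hypothetical helper: sum every "size" value found in pretty-printed JSON on stdin.
sum_sizes() {
  # printf "%.0f" forces plain integer output instead of e-notation
  grep '"size"' | awk -F ':' '{ sum += $NF } END { printf "%.0f\n", sum }'
}

# Usage:
#   docker manifest inspect -v <registry-domain>/<image-name> | sum_sizes | numfmt --to=iec-i
```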
numfmt: invalid suffix in input: ‘4.37477e+09’
Piped the output to printf to get around the numfmt error. You can also add --insecure if checking an insecure registry.
docker manifest inspect -v [--insecure] <registry-domain>/<image_name> | grep size | awk -F ':' '{sum+=$NF} END {print sum}' | xargs printf "%f\n" | numfmt --to=iec-i
– Mast
docker save my-image:latest | gzip > test.tar.gz and check the size with du -h test.tar.gz
– Iceland
Docker Hub
Get the compressed size in bytes of a specific image tag.
# for an official image the namespace is called library
curl -s https://hub.docker.com/v2/repositories/library/alpine/tags | \
jq '.results[] | select(.name=="latest") | .full_size'
# 2796860
# here the project namespace is used
curl -s https://hub.docker.com/v2/repositories/jupyter/base-notebook/tags | \
jq '.results[] | select(.name=="latest") | .full_size'
# 187647701
Get the compressed size in bytes of an image tag for a specific architecture / os.
# selecting an architecture
curl -s https://hub.docker.com/v2/repositories/library/alpine/tags | \
jq '.results[] | select(.name=="latest") | .images[] | select (.architecture=="amd64") | .size'
# 2796860
# selecting an architecture and a specific os
curl -s https://hub.docker.com/v2/repositories/library/hello-world/tags | \
jq '.results[] | select(.name=="latest") | .images[] | select (.architecture=="amd64" and .os=="linux") | .size'
# 2529
Alternative
An alternative is to use the experimental docker manifest inspect command. The advantage is that it does not rely on Docker Hub; it also works with other registries because it is based on the Image Manifest specification.
# activate experimental mode
export DOCKER_CLI_EXPERIMENTAL=enabled
docker manifest inspect -v alpine:latest | \
jq '.[] | select(.Descriptor.platform.architecture=="amd64") | .SchemaV2Manifest.layers[].size'
# 2796860
# need to sum if multiple layers
docker manifest inspect jupyter/base-notebook:latest | \
jq '[.layers[].size] | add'
# 187647701
Getting the compressed image size before pull for any registry that serves Image Manifest V2:
- Uses docker manifest inspect (available by default in recent Docker versions)
- Parses and sums layer sizes from the manifest using jq
- Formats sizes to the iec standard using numfmt (not si, sizes in manifests are 1024-based)
- Supports multi-arch manifests
$ dockersize() { docker manifest inspect -v "$1" | jq -c 'if type == "array" then .[] else . end' | jq -r '[ ( .Descriptor.platform | [ .os, .architecture, .variant, ."os.version" ] | del(..|nulls) | join("/") ), ( [ .SchemaV2Manifest.layers[].size ] | add ) ] | join(" ")' | numfmt --to iec --format '%.2f' --field 2 | sort | column -t ; }
$ dockersize mcr.microsoft.com/dotnet/core/samples:dotnetapp-buster-slim
linux/amd64 72.96M
$ dockersize ghcr.io/ddelange/pycuda/runtime:3.9-master
linux/amd64 1.84G
linux/arm64 1.80G
$ dockersize python
linux/amd64 334.98M
linux/arm/v5 308.21M
linux/arm/v7 295.69M
linux/arm64/v8 326.32M
linux/386 339.74M
linux/mips64le 314.88M
linux/ppc64le 343.86M
linux/s390x 309.52M
windows/amd64/10.0.20348.825 2.20G
windows/amd64/10.0.17763.3165 2.54G
- For Debian users: apt-get install jq
- For Mac users: brew install coreutils jq (coreutils ships numfmt)
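Where numfmt is not available, the formatting step can be approximated in awk. This is a rough sketch assuming a POSIX awk; human_iec is a hypothetical helper, and its output format (two decimals with KiB/MiB/GiB suffixes) is an arbitrary choice rather than numfmt's exact format.

```shell
# Hypothetical helper: print a byte count using 1024-based units.
human_iec() {
  awk -v bytes="$1" 'BEGIN {
    split("B KiB MiB GiB TiB", unit, " ")
    i = 1
    # divide by 1024 until the value fits the current unit
    while (bytes >= 1024 && i < 5) { bytes /= 1024; i++ }
    if (i == 1) printf "%d%s\n", bytes, unit[i]
    else        printf "%.2f%s\n", bytes, unit[i]
  }'
}

# Usage with the alpine figure quoted earlier:
#   human_iec 2796860
```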
dockersize rhub/r-minimal:latest fails with jq: error (at <stdin>:1): Cannot iterate over null (null).
– Acetylcholine
You can get all the information about an image from the Docker Hub HTTP API like this:
https://hub.docker.com/v2/repositories/library/<image_name>/tags/
For example, the URL https://hub.docker.com/v2/repositories/library/couchdb/tags/latest returns all the information about the latest couchdb image:
{
"creator": 2215,
"id": 2110662,
"images": [
{
"architecture": "amd64",
"features": "",
"variant": null,
"digest": "sha256:12c59b7f8b202476487c670ba7a042b3a654cd91302335df1bfdff0197f92968",
"os": "linux",
"os_features": "",
"os_version": null,
"size": 87486230,
"status": "active",
"last_pulled": "2022-07-25T17:18:08.778757Z",
"last_pushed": "2022-07-12T15:29:34.553194Z"
},
{
"architecture": "arm64",
"features": "",
"variant": "v8",
"digest": "sha256:5985d1edef0613a93f2d7349a7cac2296ec956674df1f96d70d4eb23e83e6f80",
"os": "linux",
"os_features": "",
"os_version": null,
"size": 85620381,
"status": "active",
"last_pulled": "2022-07-25T15:30:42.25162Z",
"last_pushed": "2022-07-12T03:05:55.128925Z"
},
{
"architecture": "ppc64le",
"features": "",
"variant": null,
"digest": "sha256:f667d0d4bf4acaeade0cf8510a4a1384ccd66c08e4ab0b678afb7f8651b9df41",
"os": "linux",
"os_features": "",
"os_version": null,
"size": 93209105,
"status": "active",
"last_pulled": "2022-07-25T12:28:16.460623Z",
"last_pushed": "2022-07-12T05:26:46.295412Z"
}
],
"last_updated": "2022-07-12T15:30:03.436304Z",
"last_updater": 1156886,
"last_updater_username": "doijanky",
"name": "latest",
"repository": 545837,
"full_size": 87486230,
"v2": true,
"tag_status": "active",
"tag_last_pulled": "2022-07-25T17:18:08.778757Z",
"tag_last_pushed": "2022-07-12T15:30:03.436304Z"
}
The image size (in bytes) is 87486230.
If you only need the full size, fetch the JSON and extract the full_size property with any JSON processor, such as jq.
For example:
curl -s https://hub.docker.com/v2/repositories/library/couchdb/tags/latest | jq 'select(.name=="latest") | .full_size'
# 87486230
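The URL differs only in the namespace (library for official images, the project name otherwise) and the tag, so building it can be wrapped in a small function. This is a sketch under those assumptions; hub_tag_url is a hypothetical helper name, and it only builds the string without contacting Docker Hub.

```shell
# Hypothetical helper: build the Docker Hub tag URL for an image and tag.
hub_tag_url() {
  hub_image=$1
  hub_tag=${2:-latest}          # default to the latest tag
  case $hub_image in
    */*) hub_repo=$hub_image ;;           # already namespaced, e.g. jupyter/base-notebook
    *)   hub_repo=library/$hub_image ;;   # official images live under library
  esac
  printf 'https://hub.docker.com/v2/repositories/%s/tags/%s\n' "$hub_repo" "$hub_tag"
}

# Usage:
#   curl -s "$(hub_tag_url couchdb latest)" | jq '.full_size'
```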
If you look into the Docker source code for the pull operation, I think your answer is there. If the image is not cached, then while pulling it Docker first collects information about the image from the registry, such as the number of layers and the size of each layer.
I would suggest reading this file:
https://github.com/moxiegirl/docker/blob/master/distribution/xfer/download.go