Why does chown increase size of docker image?

6

27

I can't understand why the chown command should increase the size of my Docker image.

The following Dockerfile creates an image of size 5.3MB:

FROM alpine:edge
RUN adduser example -D -h /example -s /bin/sh

This example however creates an image of size 8.7MB:

FROM alpine:edge
RUN adduser example -D -h /example -s /bin/sh && \
    chown -R example.example /lib

Why?

Note: My actual Dockerfile is of course much longer than this example, so the increase in image size is also considerably larger. That's why I care.

Cargill answered 6/5, 2015 at 19:13 Comment(0)
S
22

Every step in a Dockerfile generates a new intermediate image, or "layer", consisting of anything that changed on the filesystem from the previous layer. A docker image consists of a collection of layers that are applied one on top of another to create the final filesystem.

If you have:

RUN adduser example -D -h /example -s /bin/sh

Then you are probably changing nothing other than a few files in /etc (/etc/passwd, /etc/group, and their shadow equivalents).

If you have:

RUN adduser example -D -h /example -s /bin/sh && \
    chown -R example.example /lib

Then the list of things that have changed includes, recursively, everything in /lib, which is potentially much larger. In fact, in my alpine:edge container, the contents of /lib come to 3.4MB:

/ # du -sh /lib
3.4M    /lib

Which exactly accounts for the change in image size in your example.
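
The same layer model suggests that files created and chowned within a single RUN step are stored only once, because the layer records just the final state of that step. A minimal sketch, assuming a made-up /example/data directory and a 3 MB dummy file (neither comes from the question):

```dockerfile
# Sketch only: /example/data and the 3 MB dummy file are made up for illustration.
FROM alpine:edge

# The file is created and chowned inside one RUN step, so the resulting
# layer records it once, already owned by "example".
RUN adduser example -D -h /example -s /bin/sh && \
    mkdir -p /example/data && \
    dd if=/dev/zero of=/example/data/blob bs=1M count=3 && \
    chown -R example:example /example/data
```

Moving the chown into a RUN line of its own would, by the reasoning above, be expected to add a second layer of roughly another 3 MB.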

UPDATE

Using your actual Dockerfile, with the npm install ... line commented out, I don't see any difference in the final image size whether or not the adduser and chown commands are run. Given:

RUN echo "http://nl.alpinelinux.org/alpine/edge/main" > /etc/apk/repositories && \
    echo "http://nl.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories && \
    apk add -U wget iojs && \
    apk upgrade && \
    wget -q --no-check-certificate https://ghost.org/zip/ghost-0.6.0.zip -O /tmp/ghost.zip && \
    unzip -q /tmp/ghost.zip -d /ghost && \
    cd /ghost && \
#    npm install --production && \
    sed 's/127.0.0.1/0.0.0.0/' /ghost/config.example.js > /ghost/config.js && \
    sed -i 's/"iojs": "~1.2.0"/"iojs": "~1.6.4"/' package.json && \
#   adduser ghost -D -h /ghost -s /bin/sh && \
#   chown -R ghost.ghost * && \
    npm cache clean && \
    rm -rf /var/cache/apk/* /tmp/*

I get:

$ docker build -t sotest .
[...]
Successfully built 058d9f41988a
$ docker inspect -f '{{.VirtualSize}}' 058d9f41988a
31783340

Whereas given:

RUN echo "http://nl.alpinelinux.org/alpine/edge/main" > /etc/apk/repositories && \
    echo "http://nl.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories && \
    apk add -U wget iojs && \
    apk upgrade && \
    wget -q --no-check-certificate https://ghost.org/zip/ghost-0.6.0.zip -O /tmp/ghost.zip && \
    unzip -q /tmp/ghost.zip -d /ghost && \
    cd /ghost && \
#    npm install --production && \
    sed 's/127.0.0.1/0.0.0.0/' /ghost/config.example.js > /ghost/config.js && \
    sed -i 's/"iojs": "~1.2.0"/"iojs": "~1.6.4"/' package.json && \
    adduser ghost -D -h /ghost -s /bin/sh && \
    chown -R ghost.ghost * && \
    npm cache clean && \
    rm -rf /var/cache/apk/* /tmp/*

I get:

$ docker build -t sotest .
[...]
Successfully built 696b481c5790
$ docker inspect -f '{{.VirtualSize}}' 696b481c5790
31789262

That is, the two images are approximately the same size (the difference is 5,922 bytes, about 6 KB).

I would of course expect the resulting image to be larger if the npm install command could run successfully (because that would install additional files).

Slacker answered 6/5, 2015 at 19:55 Comment(6)
Thanks. I follow your logic, since I change files that are already in the base image. But what if I downloaded a bunch of files as part of a RUN command and then did a chown on them (which is closer to my actual case)? Example: getting a tar.gz file, extracting it and chowning recursively, all in one layer. Why would that also add size? My actual Dockerfile: pastebin.com/raw.php?i=ZEf2NMjP (what adds size is commented out) – Cargill
Well, you just added a bunch of files to the image, right? Or did I misunderstand your question? – Slacker
Sorry for being unclear. I stripped down my Dockerfile for the sake of readability. Check out the actual file in the pastebin above. In the example at the top I did not add files, but that is what I actually do, and there I don't follow the logic, because I only edit files that I have just added. Shouldn't that give no extra overhead? – Cargill
Where does sqlite3.tar.gz come from? I would like to build your image locally. – Slacker
I just tried to COPY the sqlite3.tar.gz and then extract it and delete it in the RUN command. That only gave an overhead of ~10MB instead of ~23MB. Still, I don't get what chown does that adds this extra size. – Cargill
@Cargill Based on your Dockerfile (in the pastebin), you're putting the content of sqlite3.tar.gz in /ghost/node_modules/, so when you do the chown you modify these files in a new layer, and they are duplicated across 2 layers. For the second try, with the COPY, sqlite3.tar.gz as it was before extraction stays in the image forever, because it lives in its own layer (so it's useless to remove it in the RUN command). One way around this would be to get it from a URI (wget in the RUN command). – Stuckey
A
18

Since Docker 17.09 you can use the --chown flag on ADD/COPY instructions in a Dockerfile to set the owner in the ADD/COPY step itself, rather than in a separate RUN step with chown, which increases the size of the image as you have noted. It would have been good to have this as the default behaviour, i.e. the copied files get the ownership of the user copying them. However, the Docker team did not want to break backward compatibility and so introduced a new flag.

Read more in Docker's documentation on COPY and ADD

COPY --chown=<user>:<group> <hostPath> <containerPath>
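
The flag can also be combined with COPY --from in a multi-stage build, which may help when the files are downloaded rather than taken from the build context. A rough sketch, assuming a hypothetical stage name "fetch" and reusing the Ghost URL and paths from the Dockerfile quoted earlier on this page:

```dockerfile
# Sketch only: the stage name "fetch" is hypothetical; the URL and paths are
# reused from the Dockerfile quoted earlier on this page.
FROM alpine:edge AS fetch
RUN apk add --no-cache wget && \
    wget -q https://ghost.org/zip/ghost-0.6.0.zip -O /tmp/ghost.zip && \
    unzip -q /tmp/ghost.zip -d /ghost

FROM alpine:edge
# Create the user first so that --chown can resolve the name.
RUN adduser ghost -D -h /ghost -s /bin/sh
# Ownership is set while the files are written into this layer,
# so no later chown step (and no duplicate copy) is needed.
COPY --from=fetch --chown=ghost:ghost /ghost /ghost
USER ghost
```

The layers of the "fetch" stage are not part of the final image, so the downloaded zip does not count towards its size either.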

The other alternatives are:

  1. Change the permission in a staging folder prior to building the image.

  2. Run the container via a bootstrap script that changes the ownership.

  3. Squash the layers!

Ambages answered 24/4, 2018 at 4:4 Comment(0)
A
3

Unfortunately, this is a known problem: https://github.com/docker/docker/issues/5505 and https://github.com/docker/docker/issues/6119#issuecomment-70606158

You can fix this by changing the docker storage driver from aufs to devicemapper as described in https://github.com/docker/docker/issues/6119#issuecomment-268870519

Antipodal answered 23/2, 2017 at 8:37 Comment(0)
B
2

You can now inspect your image and its layers visually using ImageLayers.io to help see where the extra size comes from.

Also, docker history shows similar information.

Brandes answered 8/5, 2015 at 17:48 Comment(0)
A
2

One thing to note about chown is that Docker will still consider a directory to have changed owner after chown, even if the directory was already owned by the same user. This means that something like this:

FROM ubuntu:latest
RUN mkdir hello
WORKDIR /hello
COPY . .
RUN groupadd userx && adduser userx --ingroup userx
RUN chown userx:userx -R /hello
RUN chown userx:userx -R /hello
RUN chown userx:userx -R /hello

will store the contents of /hello four times in the image (once for the COPY and once for each chown), i.e. bloat it.

You might run into this problem if, within a given parent directory, some directories have already been chowned and some have not; attaching a separate chown to each of the other Dockerfile commands instead makes things unreadable.

Therefore, at least on Ubuntu (whose chown is the GNU coreutils version), one can run:

# `--from=root:root` is essential here
RUN chown userx:userx --from=root:root -R /hello

This will avoid adding already chowned files/directories to the new image layer.
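
The --from option belongs to GNU chown. On images whose chown lacks it (BusyBox chown on Alpine does not appear to support it), a similar effect can be sketched with find, which hands chown only the entries that are not yet owned by the target user and group. This is an assumption-laden sketch rather than a tested recipe; the user name userx and the /hello path are carried over from the example above:

```dockerfile
# Sketch only: assumes BusyBox find with -user, -group and "-exec ... {} +".
FROM alpine:edge
# BusyBox adduser also creates a group with the same name as the user.
RUN adduser userx -D
WORKDIR /hello
COPY . .
# Only entries whose user or group differs are passed to chown; -h changes
# symlinks themselves rather than the files they point to.
RUN find /hello \( ! -user userx -o ! -group userx \) \
        -exec chown -h userx:userx {} +
```

If nothing under /hello needs changing, find matches nothing and chown is never run.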

Autogiro answered 6/12, 2022 at 6:59 Comment(0)
B
0

This is caused by how the Dockerfile is written. When chown -R is run in its own step, a new layer is created that stores the files again with their new ownership, resulting in a larger image.

The solution is to use the 'COPY --chown=hsadmin:hsadmin' directive, which sets the ownership of the files to the 'hsadmin' user and group while they are being copied into the image. This avoids creating a new layer just to change the ownership of the files after copying them.

FROM alpine:edge

ARG UID=1000
ARG GID=1000

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.tuna.tsinghua.edu.cn/g' /etc/apk/repositories
RUN apk add --no-cache shadow

RUN addgroup -g $GID hsadmin && adduser -D -u $UID -G hsadmin hsadmin

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.aliyun.com/g' /etc/apk/repositories && \
    mkdir -p /home/hsadmin/app

COPY --chown=hsadmin:hsadmin cloudEngine-v2.7.0.tar.gz /home/hsadmin/app
USER hsadmin

The resulting image size is 1.08 GB:

test        1.7      294cfb715748   6 minutes ago    1.08GB

If you instead copy the files first and change the ownership of the directory afterwards, the chown creates a new layer in the Docker image to store the ownership changes, thus increasing the size of the image:

FROM alpine:edge

ARG UID=1000
ARG GID=1000

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.tuna.tsinghua.edu.cn/g' /etc/apk/repositories
RUN apk add --no-cache shadow

RUN addgroup -g $GID hsadmin && adduser -D -u $UID -G hsadmin hsadmin

RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.aliyun.com/g' /etc/apk/repositories && \
    mkdir -p /home/hsadmin/app

COPY cloudEngine-v2.7.0.tar.gz /home/hsadmin/app

RUN chown -R hsadmin:hsadmin /home/hsadmin/app
USER hsadmin

The image size increases to 2.14 GB:

test          1.8          fbbb35320d83   3 seconds ago    2.14GB
Buoy answered 8/5 at 2:38 Comment(0)
