How do you add items to .dockerignore?
H

8

89

I'm not able to find many examples of what a .dockerignore file should look like.

Using puppet to install a few packages on a docker container causes the image to explode from 600MB to 3GB. I'm trying to use a .dockerignore file to keep the size to a minimum.

$ cat Dockerfile  
FROM centos:centos6

#Work around selinux problem on cent images
RUN yum install -y --enablerepo=centosplus libselinux-devel

RUN yum install -y wget git tar openssh-server; yum -y clean all

Add Puppetfile / 
RUN librarian-puppet install
RUN puppet apply --modulepath=/modules -e "class { 'buildslave': jenkins_slave => true,}"
RUN librarian-puppet clean

If I run docker images --tree I can see that the image instantly grows by several GB:

$ docker images --tree
                ├─e289570b5555 Virtual Size: 387.7 MB
                │ └─a7646acf90d0 Virtual Size: 442.5 MB
                │   └─d7bc6e1fbe43 Virtual Size: 442.5 MB
                │     └─772e6b204e3b Virtual Size: 627.5 MB
                │       └─599a7b5226f4 Virtual Size: 627.5 MB
                │         └─9fbffccda8bd Virtual Size: 2.943 GB
                │           └─ee46af013f6b Virtual Size: 2.943 GB
                │             └─3e4fe065fd07 Virtual Size: 2.943 GB
                │               └─de9ec3eba39e Virtual Size: 2.943 GB
                │                 └─31cba2716a12 Virtual Size: 2.943 GB
                │                   └─52cbc742d3c4 Virtual Size: 2.943 GB
                │                     └─9a857380258c Virtual Size: 2.943 GB
                │                       └─c6d87a343807 Virtual Size: 2.964 GB
                │                         └─f664124e0080 Virtual Size: 2.964 GB
                │                           └─e6cc212038b9 Virtual Size: 2.964 GB Tags: foo/jenkins-centos6-buildslave:latest

I believe the image grows so large because librarian-puppet clones a puppet module to /modules, which breaks the build cache.

I've tried the following .dockerignore files with no luck.

$ cat .dockerignore
/modules
/modules/
/modules/*

Is this the correct syntax for a .dockerignore file?
Are there any other ways to prevent these containers from growing so large?

Additional information:

http://kartar.net/2013/12/building-puppet-apps-inside-docker/
http://danielmartins.ninja/posts/a-week-of-docker.html

Hullo answered 25/8, 2014 at 17:8 Comment(0)
S
57

.dockerignore prevents files from being added to the initial build context that is sent to the docker daemon when you run docker build. It doesn't create a global rule that excludes files from being created in the images generated by a Dockerfile.

It's important to note that each RUN statement will generate a new image, with the parent of that image being the image generated by the Dockerfile statement above it. Try collapsing your RUN statements into a single one to reduce image size:

RUN librarian-puppet install &&\
 puppet apply --modulepath=/modules -e "class { 'buildslave': jenkins_slave => true,}" &&\
 librarian-puppet clean
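
To check whether collapsing the RUN statements actually helped, it can be useful to look at how much each layer adds. The sketch below wraps docker history, a standard Docker CLI command; the function name layer_sizes is hypothetical and only used for illustration. A layer keeps its size even if a later layer deletes its files, which is why the cleanup has to happen in the same RUN.

```shell
# Hypothetical helper: print the size each layer contributes to an image,
# next to the Dockerfile instruction that created that layer.
layer_sizes() {
  docker history --no-trunc --format '{{.Size}}  {{.CreatedBy}}' "$1"
}

# Usage (image name taken from the question):
#   layer_sizes foo/jenkins-centos6-buildslave:latest
```

If the large layer is still the collapsed RUN, the space is being used by files that the cleanup step does not remove (for example, packages installed by puppet apply), not by /modules.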
Substitution answered 9/9, 2014 at 15:26 Comment(3)
Thanks for your comment on my answer and this nice solution for the initial problem.Berserk
If those files are not needed to actually run the container, you can delete them as a final "cleanup" step in the Dockerfile. The resulting image will be smaller, and you can also safely remove the intermediary build images to free disk space.Potpourri
@Abe Voelker: I tried collapsing all the RUN statements to a single RUN statement, but the image size is the same. why is that?Brannan
A
99

The .dockerignore syntax is similar to the .gitignore syntax. Here are some example rules:

# Ignore a file or directory in the context root named "modules"
modules

# Ignore any files or directories within the subdirectory named "modules" 
# in the context root
modules/*

# Ignore any files or directories in the context root beginning with "modules"
modules*

# Ignore any files or directories one level down from the context root named
# "modules"
*/modules

# Ignore any files or directories at any level, including the context root, 
# named modules
**/modules

# Ignore every file in the entire build context (see next rule for how this 
# could be used)
*

# Re-include the file or directory named "src" that may have been previously
# excluded. Note that you cannot re-include files in subdirectories that have 
# been previously excluded at a higher level
!src

Note that "build context" is the directory you pass at the end of your build command, typically a . to indicate the current directory. This directory is packaged from the docker client, excluding any files you have ignored with .dockerignore, and sent to the docker daemon to perform the build. Even when the daemon is on the same host as your client, the build only works from this context and not directly from the folders.

There is only a single .dockerignore for a build, and it must be in the root of the build context. It will not work if it is in your home directory (assuming you build from a subdirectory), and it will not work from a subdirectory of your build context.

To test what is in your current build context and verify that your .dockerignore file is behaving correctly, you can copy/paste the following. This assumes you do not have an image named test-context; if you do, it will be overwritten and then deleted:

# create an image that includes the entire build context
docker build -t test-context -f - . <<EOF
FROM busybox
COPY . /context
WORKDIR /context
CMD find .
EOF

# run the image which executes the find command
docker container run --rm test-context

# cleanup the built image
docker image rm test-context
Auspicate answered 11/5, 2019 at 0:8 Comment(0)
B
30

The format of .dockerignore is similar to that of .gitignore. See a sample file and the docker documentation (but there are some differences, e.g. see the comments below).

The file should be a list of exclusion patterns (relative to the path of the .dockerignore file) separated by a newline.

So you should try the following .dockerignore:

modules/*

The / at the beginning may have been the mistake, as it will only be valid for the root directory of the file (but not for subdirectories, so maybe the recursive version without the / will do a better job instead).

Berserk answered 25/8, 2014 at 17:17 Comment(9)
Thanks for the answer. I updated the .dockerignore to the syntax suggested and rebuilt, but it looks like the container is still growing to 3GB. Maybe I'm misunderstanding what the .dockerignore file is supposed to do.Hullo
@Hullo Hm, I must confess, I haven't done anything with Puppet so far, so I don't understand every command up there. But for me, it doesn't look like a .dockerignore related problem. The behavior you noticed seems strange to me and like it shouldn't be like it is. So you'd make me curious to know where the problem is, too. So feel free to open up a new general question where the problem might be or even open up an issue at the docker git repository. Perhaps someone experienced a similar behavior and knows the answer. For some better information, try to get some log files from the build.Berserk
.dockerignore is not exactly equivalent to .gitignore. For example, .gitignore has rules for evaluating directories that are not followed by .dockerignore; e.g. /log in a .gitignore will ignore the log directory from the same directory that .gitignore is in, but .dockerignore doesn't process the leading / the same way and will not exclude it from the build context. So you won't necessarily be able to copy the rules directly from your .gitignore into your .dockerignore unmodified.Substitution
One thing that bit me is that the newlines need to be Unix newlines, not Windows ones. I had a .dockerignore-file edited on Windows sent over to a Unix system and docker kept "ignoring" my ignores until I converted it to Unix style newlines!Northampton
An additional difference between .gitignore and .dockerignore is the trailing slash - it caught me out so thought I'd share. In .gitignore files a trailing slash is used to specifically identify a directory i.e something/ would exclude a directory named something, but not a file named something. .dockerignore files don't follow this rule so if you define a directory name with a trailing slash in your .dockerignore file, it's unrecognized i.e. not ignored. I have no idea how you would differentiate between a file and a directory in .dockerignore files so maybe someone else can tell us???Hoard
update - .dockerignore was added in docker 1.1.0. At the time of writing my ubuntu repo has docker 1.0.1 but the latest version of docker is 1.4.0. Not sure about the earlier implementations of the .dockerignore file but in docker 1.3.3 it doesn't support the trailing slashes on directory names meaning you can't differentiate between a filename and a directory name. If you add foo to the .dockerignore file, both files called foo and directories called foo would be ignored. As of 1.4.0, trailing slashes on directory names are supported, matching the .gitignore syntax.Hoard
@Hoard Thanks for the follow-up! Feel free to edit my answer to improve and/or update it, if you like. It's sad that the documentation from Docker doesn't tell us any more about the syntax of this file.Berserk
@Berserk yes the documentation sucks. I wasted hours & hours on problems with the .dockerignore not doing what I expected. Initially it was because my ubuntu docker version was 1.0.1 and didn't support the ignore file. I upgraded docker on the ubuntu server to 1.4.0 and then the ignore file was working with the trailing slashes on directories, but 1.4.0 is unstable and the docker builds kept failing. I downgraded docker to 1.3.3 then had problems with the ignore file again because of the slashes which I had to remove.Hoard
Another way they differ: in .gitignore you might have *.jar, while in .dockerignore it should be **/*.jarTe
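
The Windows-newline problem mentioned in the comments above can be checked and fixed from a shell. This is a minimal sketch that works on a scratch copy; the workdir variable and the sample patterns are made up for the demo, and in practice you would point the tr command at your real .dockerignore.

```shell
# Work in a scratch directory so no real file is touched (demo assumption).
workdir=$(mktemp -d)

# Simulate a .dockerignore that was saved with Windows (CRLF) newlines.
printf 'modules\r\n*.log\r\n' > "$workdir/.dockerignore"

# Strip the carriage returns into a temp file, then replace the original.
tr -d '\r' < "$workdir/.dockerignore" > "$workdir/.dockerignore.tmp"
mv "$workdir/.dockerignore.tmp" "$workdir/.dockerignore"

# Count lines that still end in a carriage return; 0 means the file is clean.
cr_lines=$(grep -c "$(printf '\r')$" "$workdir/.dockerignore" || true)
echo "lines with CR: $cr_lines"
```

Tools such as dos2unix do the same job if they are installed; tr is used here only because it is available on practically every Unix system.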
L
25

Neither:

modules/*

nor

modules

worked for me; docker kept on polluting the image with unnecessary files until I set it like this:

**/modules

also works with:

**/modules/*.py
Langobardic answered 2/8, 2017 at 15:59 Comment(2)
This should be the accepted answer, this syntax seems to be the only one respected by docker on windowsRingsmuth
Thank you, you saved me a lot of time. I have a 17GB folder of PDFs that doesn't make sense to rebuild over and over again.Sphagnum
T
3

A different way to get a smaller image is to run librarian-puppet on the host, not in Docker, so you don't end up with librarian, ruby, gems, etc. installed in the image.

I ended up with a 622MB image for a jenkins slave using Puppet, and a 480MB image without Puppet.

Thyroid answered 18/9, 2014 at 10:44 Comment(0)
O
2

http://docs.docker.com/articles/dockerfile_best-practices/

It seems to me your approach is backwards (agreeing with @csanchez), and that you should be generating your docker container from puppet, not running puppet in the container...

Also, you should && the install/apply/clean lines together... each docker command creates an incremental image... If there are temporary/resource files that are part of the centos yum commands, you should likewise do the same.

FROM centos:centos6

# Add your needed files first
# maybe you could use a baseimage and make this a volume mount point?
Add Puppetfile / 

# combine multiple commands with cleanup of cache/temporary
# space in the same run sequence to reduce wasted diff image space.

#Work around selinux problem on cent images
RUN yum install -y --enablerepo=centosplus libselinux-devel && \
    yum install -y wget git tar openssh-server && yum -y clean all && \
    librarian-puppet install && \
    puppet apply --modulepath=/modules -e "class { 'buildslave': jenkins_slave => true,}" && \
    librarian-puppet clean

I'd REALLY suggest avoiding SELinux in a container; it doesn't give you anything inside a container. Also, depending on what you are trying to create, there are smaller places to start from than centos6. I believe ubuntu is smaller, debian:wheezy smaller still, or even alpine for a tiny starting point.

It is worth noting that, if you're using a file system that supports virtual mounts, multiple instances can reuse the same base image, so the disk usage won't grow with each instance.

Osgood answered 14/4, 2015 at 22:46 Comment(0)
M
2

The main goal of .dockerignore is to keep the image small. It serves a purpose similar to .gitignore, and a smaller image reduces latency and response time when services are deployed. This matters for deployment automation such as Puppet, SaltStack or Ansible: a deployment with a defined timeout may fail because of a large image and low network bandwidth. So .dockerignore helps keep the image as small as possible.

Place it in the build context directory, which is the directory specified at the end of the docker build command. The file uses glob patterns to match the files and directories to exclude from the final image.

Suppose I have a directory .img/ in my build context and I want to exclude it while building the image. I simply add the following line to the .dockerignore file:

.img

And if I want to exclude all files whose names start with a dot, I simply add the line:

.*

(Note: Unix glob patterns are different from regular expressions; don't confuse the two.)

In addition, I'll exclude few more of my files from my build context,

.*
docs
my-stack.dab
docker-compose.override.yml
test*
*.md
!README.md

Here, the *.md line excludes all Markdown files (I have many Markdown files in my project). But I want to include README.md and no other Markdown file, so the last line, !README.md, re-includes it as an exception to the *.md rule.

So, with the help of .dockerignore, we can reduce the overhead of the build and keep the image smaller.

Moslem answered 4/6, 2017 at 8:22 Comment(0)
C
0

I think the best solution for your use case is to use a multi-stage build in your Dockerfile. Your Dockerfile must be in an empty directory, and you run puppet in a disposable container.

From the link above:

With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image.
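
As a rough sketch of what the quoted passage describes, the snippet below writes a two-stage Dockerfile into a scratch directory. The stage name modules, the ctx variable and the choice to copy only /modules are assumptions made for illustration. As in the question, the installation of librarian-puppet is not shown, and a real build would still need a puppet apply step in the final stage. Multi-stage builds need Docker 17.05 or later.

```shell
# Write a hypothetical two-stage Dockerfile to a scratch build context.
ctx=$(mktemp -d)
cat > "$ctx/Dockerfile" <<'EOF'
# Stage 1: fetch the Puppet modules. Nothing from this stage reaches the
# final image unless it is copied out explicitly.
FROM centos:centos6 AS modules
# (installation of librarian-puppet is omitted here, as in the question)
ADD Puppetfile /
RUN librarian-puppet install

# Stage 2: start again from a clean base and copy only /modules across.
FROM centos:centos6
COPY --from=modules /modules /modules
EOF
echo "wrote $ctx/Dockerfile"

# To build, put a Puppetfile in the same directory and run:
#   docker build -t some-tag "$ctx"
```

Note that anything copied into the final stage becomes a layer there, so this only saves space for files that are left behind in the first stage.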

Coley answered 10/5, 2019 at 22:14 Comment(0)
