Theory
First, some Concourse theory (at least as of v3.3.1):
People often talk about Concourse having a "cache", but misinterpret what that means. Every Concourse worker has a set of volumes on disk which are left around, forming a volume cache. This volume cache contains volumes that have been populated by resource get and put steps and by task outputs.
People also often misunderstand how the docker-image-resource uses Docker. There is no global Docker server running with your Concourse installation. In fact, Concourse containers are not Docker containers; they are runC containers. Every docker-image-resource process (check, get, put) runs inside its own runC container, inside of which a local Docker server is running. This means that there's no global Docker server pulling Docker images and caching the layers for further use.
What this implies is that when we talk about caching with the docker-image-resource, it means loading or pre-pulling images into the local docker server.
Practice
Now to the options for optimizing build times:
load_base
Background
The load_base param in your docker-image-resource put tells the resource to first docker load an image (retrieved via a get) into its local docker server, before building the image specified via your put params.
This is useful when you need to pre-populate an image into your "docker cache." In your case, you would want to preload the image used in the FROM directive. This is more efficient because it uses Concourse's own volume caching to pull the "base" only once, making it available to the docker server when the FROM instruction executes.
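As a rough illustration of what this amounts to, the sketch below prints the Docker commands that a put with load_base roughly corresponds to. It is an approximation, not the resource's actual script. The helper name load_base_then_build and the DOCKER dry-run variable are hypothetical, and it assumes the get with save: true leaves a docker save tarball at ubuntu/image and that the loaded image carries the tag that FROM refers to.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DOCKER=docker to run them against a real daemon.
DOCKER="${DOCKER:-echo docker}"

# Hypothetical helper, not part of docker-image-resource.
# $1: directory produced by the base image get (assumed to hold an `image` tarball)
# $2: name for the image being built
# $3: path to the Dockerfile
# $4: build context directory
load_base_then_build() {
  base_dir="$1"; target="$2"; dockerfile="$3"; context="$4"
  # Load the base image from the Concourse-cached volume; no registry pull.
  $DOCKER load -i "$base_dir/image"
  # Assumption: the loaded image is tagged so that FROM finds it locally.
  $DOCKER build -t "$target" -f "$dockerfile" "$context"
}

load_base_then_build ubuntu mydocker/python repo/ci/Dockerfile repo/ci
```

In dry-run mode this only echoes the docker load and docker build commands, which makes the order of operations visible without needing a Docker daemon.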
Usage
You can use load_base as follows:
Suppose you want to build a custom python image, and you have a git repository with a file ci/Dockerfile as follows:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python python-pip
If you wanted to automate building/pushing of this image while taking advantage of Concourse volume caching as well as Docker image layer caching:
resources:
- name: ubuntu
  type: docker-image
  source:
    repository: ubuntu
- name: python-image
  type: docker-image
  source:
    repository: mydocker/python
- name: repo
  type: git
  source:
    uri: ...
jobs:
- name: build-image-from-base
  plan:
  - get: repo
  - get: ubuntu
    params: {save: true}
  - put: python-image
    params:
      load_base: ubuntu
      dockerfile: repo/ci/Dockerfile
cache & cache_tag
Background
The cache and cache_tag params in your docker-image-resource put tell the resource to first pull a particular image+tag from your remote source, before building the image specified via your put params.
This is useful when it's easier to pull down the image than it is to build it from scratch, e.g. when you have a very long build process, such as expensive compilations.
This DOES NOT utilize Concourse's volume caching. Instead, it relies on Docker's --cache-from feature, which runs the risk of needing to first perform a docker pull during every put.
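To make the pull-then-build behavior concrete, here is a minimal sketch that prints the Docker commands this roughly corresponds to. It is an approximation under stated assumptions, not the resource's actual script: the helper name pull_then_build and the DOCKER dry-run variable are hypothetical, and the image name, tag, and paths are simply the ones used in the example pipeline in this answer.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DOCKER=docker to run them against a real daemon.
DOCKER="${DOCKER:-echo docker}"

# Hypothetical helper, not part of docker-image-resource.
# $1: repository, $2: tag to pull as the cache source
# $3: path to the Dockerfile, $4: build context directory
pull_then_build() {
  repo="$1"; cache_tag="$2"; dockerfile="$3"; context="$4"
  # Pull the previously pushed image so its layers exist in the local daemon.
  # This hits the registry on every run; Concourse volumes are not involved.
  $DOCKER pull "$repo:$cache_tag"
  # Let the build reuse matching layers from the pulled image.
  $DOCKER build --cache-from "$repo:$cache_tag" -t "$repo:$cache_tag" \
    -f "$dockerfile" "$context"
}

pull_then_build mydocker/ruby 2.0.0-compiled repo/ci/Dockerfile repo/ci
```

The docker pull line is the cost referred to above: it has to happen before the build can reuse any layers, and it happens inside a fresh container each time.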
Usage
You can use cache and cache_tag as follows:
Suppose you want to build a custom ruby image, where you compile ruby from source, and you have a git repository with a file ci/Dockerfile as follows:
FROM ubuntu
# Install Ruby
RUN mkdir /tmp/ruby;\
  cd /tmp/ruby;\
  curl ftp://ftp.ruby-lang.org/pub/ruby/2.0/ruby-2.0.0-p247.tar.gz | tar xz;\
  cd ruby-2.0.0-p247;\
  chmod +x configure;\
  ./configure --disable-install-rdoc;\
  make;\
  make install;\
  gem install bundler --no-ri --no-rdoc
RUN gem install nokogiri
If you wanted to automate building/pushing of this image while taking advantage of only Docker image layer caching:
resources:
- name: compiled-ruby-image
  type: docker-image
  source:
    repository: mydocker/ruby
    tag: 2.0.0-compiled
- name: repo
  type: git
  source:
    uri: ...
jobs:
- name: build-image-from-cache
  plan:
  - get: repo
  - put: compiled-ruby-image
    params:
      dockerfile: repo/ci/Dockerfile
      cache: mydocker/ruby
      cache_tag: 2.0.0-compiled
Recommendation
If you want to increase efficiency of building docker images, my personal belief is that load_base should be used in most cases. Because it uses a resource get, it takes advantage of Concourse volume caching, and avoids needing to do extra docker pulls.