Update: I now recommend using artifacts
with a short expire_in
. This is superior to cache
because it only has to write the artifact once per pipeline whereas the cache is updated after every job. Also the cache is per runner so if you run your jobs in parallel on multiple runners it's not guaranteed to be populated, unlike artifacts which are stored centrally.
Gitlab CI 8.2 adds runner caching which lets you reuse files between builds. However I've found this to be very slow.
Instead I've implemented my own caching system using a bit of shell scripting:
before_script:
# unique hash of required dependencies
- PACKAGE_HASH=($(md5sum package.json))
# path to cache file
- DEPS_CACHE=/tmp/dependencies_${PACKAGE_HASH}.tar.gz
# Check if cache file exists and if not, create it
- if [ -f $DEPS_CACHE ];
then
tar zxf $DEPS_CACHE;
else
npm install --quiet;
tar zcf - ./node_modules > $DEPS_CACHE;
fi
This will run before every job in your .gitlab-ci.yml
and only install your dependencies if package.json
has changed or the cache file is missing (e.g. first run, or file was manually deleted). Note that if you have several runners on different servers, they will each have their own cache file.
You may want to clear out the cache file on a regular basis in order to get the latest dependencies. We do this with the following cron entry:
@daily find /tmp/dependencies_* -mtime +1 -type f -delete