Accelerating Docker in Gitlab CI
One thing you may notice when building Docker images is that builds can take quite a long time. This is especially true when Docker is used in a CI/CD scenario, where there is often no image cache present (since a different CI worker may be chosen for each build). Below you can see the typical time it takes to build a Gitlab pipeline without any caching involved.
Docker lets us address this caching issue with the --cache-from option, which loads cached layers from an existing image. This way, we can use the previously built image as a cache for the next CI build. So, let’s look at how we can improve the build times with this approach.
As you probably know, Gitlab uses YAML files to configure CI. A basic
.gitlab-ci.yml for Docker would look something like the one
below. This file mostly remains the same, since the actual ‘guts’ of the build
are all neatly contained within a
Dockerfile (isn’t Docker beautiful?).
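A minimal sketch of such a file, assuming the image is pushed to the project's own Gitlab registry under the latest tag. The job name and tags are illustrative assumptions; the CI_REGISTRY* variables are the ones Gitlab predefines for each job. Depending on the runner setup, extra variables such as DOCKER_HOST or DOCKER_TLS_CERTDIR may also be needed.

```yaml
# Sketch only: job name and image tags are assumptions.
image: docker:latest        # build environment with the Docker CLI installed

services:
  - docker:dind             # Docker-in-Docker daemon used by the job

stages:
  - build

build:
  stage: build
  script:
    # Authenticate against the project's Gitlab registry.
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    # Build from the Dockerfile in the repository root, then publish.
    - docker build -t "$CI_REGISTRY_IMAGE:latest" .
    - docker push "$CI_REGISTRY_IMAGE:latest"
```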
To step back briefly, the image setting instructs Gitlab Runner to execute the
build in an environment that has Docker installed. Meanwhile, the services
setting (typically docker:dind) instructs Gitlab Runner to enable Docker in
Docker, allowing us to build new Docker images from inside the CI Docker
container.
Ideally, adding the above-mentioned option should fix the caching issue and deliver fast builds. Thus, we modify our Gitlab CI configuration as follows:
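A sketch of that change, assuming the previously built image lives at $CI_REGISTRY_IMAGE:latest (the tag and job name are assumptions):

```yaml
# Sketch only: adds --cache-from, but does not yet pull the image.
image: docker:latest

services:
  - docker:dind

stages:
  - build

build:
  stage: build
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    # Ask Docker to reuse layers from the previously built image.
    - docker build --cache-from "$CI_REGISTRY_IMAGE:latest" -t "$CI_REGISTRY_IMAGE:latest" .
    - docker push "$CI_REGISTRY_IMAGE:latest"
```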
However, if you do this, you will observe no cached layers being used and absolutely no speed improvement. The reason is simple: the cache is not populated. Even though Docker does attempt to use the cache from the image we configure, it fails because that image is not available locally on the CI worker! Consequently, we need to update our Gitlab CI configuration to also pull the image before the build. This looks something like this:
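A sketch of the pull-then-build configuration, under the same assumptions as before (latest tag, illustrative job name). The `|| true` is an added safeguard so that the very first pipeline, when no image exists yet, does not fail on the pull.

```yaml
# Sketch only: pull the previous image so its layers are available locally.
image: docker:latest

services:
  - docker:dind

stages:
  - build

build:
  stage: build
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    # Populate the local layer cache; tolerate a missing image on the first run.
    - docker pull "$CI_REGISTRY_IMAGE:latest" || true
    - docker build --cache-from "$CI_REGISTRY_IMAGE:latest" -t "$CI_REGISTRY_IMAGE:latest" .
    - docker push "$CI_REGISTRY_IMAGE:latest"
```

On Docker versions where BuildKit is the default builder, the pushed image may additionally need inline cache metadata (for example via --build-arg BUILDKIT_INLINE_CACHE=1) before --cache-from takes effect.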
Now, we are finally ready to experience the fruits of our labour. As you can see, the build is now down to 30 seconds from about 2 minutes. While the speed increase may seem trivial for this simple case, more complex build processes can benefit greatly from this simple trick.
As you may have noticed, this section is titled ‘Simple builds’. The problem with the approach described above is that it simply does not work for multi-stage builds. Let’s figure out why exactly that is the case.
There is a rather detailed explanation of what a multi-stage build is and why
it is beneficial in my article about
containerising a Jekyll website.
In a nutshell, a multi-stage build enables us to exclude unnecessary layers
(i.e., those used only at build time) from the final image, thereby keeping the
total size down. As a side effect, this also means that the final image we pass
to --cache-from simply does not have the build-time layers available. In the
most extreme circumstances, this leads to a long build process that uses the
cached layers only at the very end.
We can fix this by pushing our first stage (the build environment image) to the registry alongside the final image. Luckily, the Gitlab Registry supports storing multiple images for the same repository.
Therefore, we simply need to apply the same technique to each stage of the
multi-stage build process. The example
.gitlab-ci.yml below shows exactly how
this can be achieved.
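A sketch of how this might look for a two-stage Dockerfile. The stage name builder, the /builder image path, and the variable names BUILDER_IMAGE and FINAL_IMAGE are assumptions for illustration; adjust them to match the stage names in your own Dockerfile.

```yaml
# Sketch only: assumes the Dockerfile's first stage is declared "AS builder".
image: docker:latest

services:
  - docker:dind

stages:
  - build

build:
  stage: build
  variables:
    BUILDER_IMAGE: "$CI_REGISTRY_IMAGE/builder:latest"   # hypothetical name
    FINAL_IMAGE: "$CI_REGISTRY_IMAGE:latest"
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    # Pull both images so that every stage has cached layers available.
    - docker pull "$BUILDER_IMAGE" || true
    - docker pull "$FINAL_IMAGE" || true
    # Build and tag the first stage on its own, reusing its previous layers.
    - docker build --target builder --cache-from "$BUILDER_IMAGE" -t "$BUILDER_IMAGE" .
    # Build the final image, using both images as cache sources.
    - docker build --cache-from "$BUILDER_IMAGE" --cache-from "$FINAL_IMAGE" -t "$FINAL_IMAGE" .
    - docker push "$BUILDER_IMAGE"
    - docker push "$FINAL_IMAGE"
```

A Dockerfile with more than two stages would follow the same pattern: one --target build, one pull, and one push per intermediate stage.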
Now, we are again ready to experience the fruits of our labour. As you can see, the build is now down to just 1 minute from about 3 minutes on average.
Boom! This way you can accelerate your Docker builds for Gitlab CI with minimal changes. These general principles can also be applied to other CI systems, so happy Docker’ing and see you again!