Containerising a Jekyll website
I am a big fan of Docker as a tool for creating predictable development, CI and production environments. The ability to build and/or run pretty much any software with just Docker installed is stunning and outstandingly useful. This is why I use Docker on my private server to run all the required services. In this blog post we will look at how I containerised this blog, applying Docker to both building and execution.
Building with Docker
When working with any automated build process, the aim of the developer is
to capture the build instructions. Docker provides a standardised way to
achieve this with a Dockerfile. What makes it even more powerful is that Docker
enables you to capture the build environment as well, guaranteeing fast,
repeatable and predictable builds.
In order to capture the build environment, we will derive from the jekyll/jekyll Docker image. This will provide us with a *nix system that has the specified version of Jekyll installed. Build instructions can be easily picked up from the official Jekyll documentation and essentially boil down to running jekyll build.
The destination folder will contain a plain HTML static website that we can
then serve with any web server. Therefore, our Dockerfile would look
something like the following.
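
The original listing is not reproduced here, so what follows is only a minimal sketch of such a single-image Dockerfile. It assumes that the site sources sit next to the Dockerfile and that the image's default working directory is used; the `--host` value and the exposed port (Jekyll's default, 4000) are illustrative choices rather than the author's exact settings.

```dockerfile
# Build environment: the official Jekyll image.
FROM jekyll/jekyll

# Copy the website sources into the image's default working directory.
COPY . .

# Generate the static site (Jekyll writes it to _site by default).
RUN jekyll build

# Jekyll's built-in server listens on port 4000 by default.
EXPOSE 4000

# Serve the site with the built-in development server.
# Binding to 0.0.0.0 makes it reachable from outside the container.
CMD ["jekyll", "serve", "--host", "0.0.0.0"]
```
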
jekyll serve starts a built-in development server and shouldn’t be used
for delivering websites in production. Instead we should opt for a fully
featured standalone web server. Personally, I like to use nginx due to its
performance and configurability. The question becomes: how exactly can we
utilise nginx in the environment we defined?
Previously, I only ever containerised apps that don’t require building. In such cases, we can simply derive from a base production image and copy our app into the container (as the example above does). This method, however, is not scalable when containerising an app that does require building, as you would have to mould your container to support both building and execution. This violates the ‘separation of concerns’ principle and blows up the size of your Docker image unnecessarily.
For example, in our case this would mean taking the jekyll/jekyll image with
all of its layers and installing the nginx web server on top of it. We would
likely have to install software manually, losing any official support and
updates. What is more, we would drag at least 432MB of unnecessary layers to
production.
Fortunately, a new feature in Docker 17.05 called multi-stage builds comes to
the rescue. It enables us to use multiple containers throughout the build
process and thus produce lightweight images while taking advantage of all the
Docker features. Note that you might need to upgrade your Docker to access
this feature, or you'll encounter an Error parsing reference issue. Also, if
you are indeed on an older version, you can read about my adventures while
upgrading Docker on Ubuntu.
With this feature, we can build our website in a jekyll/jekyll container and
then copy the result to an nginx container. Thus, our Dockerfile would look
something like the one below.
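
The listing itself did not survive, so here is a hedged sketch of what a first multi-stage attempt could look like. The stage name `build` is an arbitrary label; `/srv/jekyll` is the image's default working directory, `_site` is Jekyll's default destination folder, and `/usr/share/nginx/html` is the directory the stock nginx image serves by default.

```dockerfile
# Stage 1: build the static site inside the Jekyll image.
FROM jekyll/jekyll AS build

# Sources go into the image's default working directory (/srv/jekyll).
COPY . .
RUN jekyll build

# Stage 2: a clean nginx image that only receives the build output.
FROM nginx

# Copy the generated site from the first stage into nginx's web root.
COPY --from=build /srv/jekyll/_site /usr/share/nginx/html
```

Only the last stage ends up in the final image, so none of the Jekyll layers are shipped to production.
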
Grappling with the Jekyll image
The jekyll/jekyll image defines its default working directory, /srv/jekyll, as
a volume. This effectively leads to the results of jekyll build being wiped
out (changes a build step makes inside a declared volume are discarded once
the step completes). In order to fix this issue, I had to change the default
working directory to something else.
Unfortunately, the resulting Dockerfile didn't work either and the build
failed. Since the site directory does not exist by default, it is created with
the wrong permissions. This prevents jekyll build from creating a destination
folder and fails the build as a whole. In order to fix this issue, we need to
change the permissions on this folder. After applying the fix, we get the
Dockerfile shown below.
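
As a rough reconstruction rather than the author's exact file: the sketch below moves the build to a directory that is not a declared volume and loosens its permissions. The path `/site` is a hypothetical choice, and `chmod 777` is just one simple way to make the folder writable; the original fix may have used a different command.

```dockerfile
FROM jekyll/jekyll AS build

# Hypothetical directory outside the /srv/jekyll volume.
# WORKDIR creates it if it does not exist, owned by root.
WORKDIR /site

# Make the directory writable so that the Jekyll process
# can create its destination folder inside it.
RUN chmod 777 /site

COPY . .
RUN jekyll build

FROM nginx
COPY --from=build /site/_site /usr/share/nginx/html
```
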
This is a fully functional example and it does exactly what we wanted: it
builds the website in a jekyll/jekyll container and produces a new image based
on nginx. Except… it is really slow!
Gotta go fast(er)
As you can see, it takes 2-3 minutes to build a container for my blog. This is really slow compared to the performance you would get when building the website locally. Looking at the output of the build daemon, we can see that Docker spends most of the time installing dependencies on every build.
Ideally, we would want this step to be cached, as the Ruby dependencies of the
website will rarely change. The jekyll/jekyll documentation suggests caching
with a volume, which (in my opinion) is a really bad idea. Implementing this
method goes against the 'predictable environment' idea and is, frankly
speaking, an anti-pattern. Instead, we should look at structuring our image
better.
The current problem is that restoring dependencies and performing a build are
combined in a single RUN statement that comes after the COPY. This means that
Docker needs to re-create the RUN image layer (perform restore + build) every
time the contents of the website change. While this is desirable for the build
action, we don't want this behaviour for the restore action. We can achieve
this by rewriting the Dockerfile as below.
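
A sketch of the usual layering technique, under the assumption that the site declares its dependencies in a `Gemfile` (optionally with a `Gemfile.lock`) and keeping the hypothetical `/site` directory from before: the dependency manifests are copied and restored first, so that layer is reused from cache until the manifests themselves change.

```dockerfile
FROM jekyll/jekyll AS build

# Hypothetical build directory outside the image's declared volume.
WORKDIR /site
RUN chmod 777 /site

# Copy only the dependency manifests first. This layer, and the
# restore below, are rebuilt only when Gemfile or Gemfile.lock change.
COPY Gemfile* ./
RUN bundle install

# Copy the rest of the site. Editing a post invalidates the cache
# from this point on, leaving the restored dependencies untouched.
COPY . .
RUN jekyll build

FROM nginx
COPY --from=build /site/_site /usr/share/nginx/html
```
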
Needless to say, the first build will still be slow since this is when the cache is generated. Let’s see how much quicker (if at all) our build became after this optimisation was introduced.
As we can see, we achieved an almost tenfold increase in speed, reducing the time to build an image to just 20 seconds.
Gotta go clean(er)
Given how much trouble the jekyll/jekyll image caused us and the fact that we
already pull Jekyll itself using bundler install, we can replace our base
build image with something lower level. Since the original build image uses
Ruby 2.5, we can derive from the ruby:2.5 Docker image. Changing the base
image also enabled us to remove the permission fix we introduced earlier.
Unfortunately, this fails because the new base image lacks a component that
Jekyll apparently has a hard dependency on. Installing it is as simple as
running apt-get, since the chosen base image is built on top of a Debian-based
distribution. Following the Docker best practices, we updated the Dockerfile
as seen below.
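
The final listing is missing from this copy, so the following is a sketch under stated assumptions, not the author's file. The name of the missing package is not given above, so it is passed in through a hypothetical build argument, `EXTRA_PACKAGES`; `/site` is again a hypothetical working directory. Image tags and package availability may also have changed since the post was written.

```dockerfile
# Build stage: plain Ruby image instead of jekyll/jekyll.
FROM ruby:2.5 AS build

# Hypothetical build argument: the package Jekyll needs is not named
# in the text, so supply it with --build-arg EXTRA_PACKAGES=<package>.
ARG EXTRA_PACKAGES=""

# Update, install and clean up in a single RUN, so that stale
# package lists never end up in a cached layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ${EXTRA_PACKAGES} \
 && rm -rf /var/lib/apt/lists/*

# Build the site in production mode.
ENV JEKYLL_ENV=production

WORKDIR /site

# Restore dependencies (including Jekyll itself) in a cacheable layer.
COPY Gemfile* ./
RUN bundle install

# Build the site from the full sources.
COPY . .
RUN bundle exec jekyll build

# Runtime stage: nginx serving the generated static files.
FROM nginx
COPY --from=build /site/_site /usr/share/nginx/html
```

Keeping `apt-get update` and `apt-get install` in the same `RUN` statement is the part that follows the best-practice guidance: it prevents a cached, outdated package index from being reused by a later install.
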
In the final result, I also added ENV JEKYLL_ENV=production, which tells
Jekyll to build the website in production mode. Voilà! This way you can
containerise your Jekyll website with ~10 simple lines of code. These general
principles can be applied to other software as well, so happy Docker'ing and
see you again!