Nix Build Caching Inside Docker Containers
Nix is a very useful tool for CI because it provides a portable and consistent build environment with accurate caching. I use it for almost all of my projects and have codified my setup in https://gitlab.com/kevincox/nix-ci/, a ready-to-go CI template for GitLab that uses Nix. While it supports binary caches (only Cachix at the moment), re-downloading all of the dependencies for each job can still take a minute or two. Not only is this a waste of time and resources, but it also increases the odds that a small blip, whether on the runner or the binary cache, will cause a transient build failure. Since I am using self-hosted runners for a number of projects (my CPUs are a lot faster than those available on the shared runners), I wanted to see if I could cache dependencies without uploading, downloading or copying them around for each job.
GitLab supports disk-based caches for self-hosted runners, but using them with a Nix-based image such as nix-ci was problematic. The most obvious solution won’t work: you can declare /nix to be a cache volume that will be reused across jobs, but this has a couple of problems.
# GitLab Runner Config
[[runners]]
  executor = "docker"
  [runners.docker]
    volumes = ["/nix"]
One problem with this approach is that two concurrently running jobs will get different /nix volumes. While the volumes will be reused for future jobs, it means that the probability of a cache hit drops with the number of parallel jobs.
The bigger problem is that because the container is Nix-based it has its own files at /nix. This seems to work for the first job, and for subsequent jobs with the same image. However, jobs with a different image will fail with confusing “file not found” errors as the cache volume shadows the image files.
The solution to this was to take advantage of the “chroot-store” feature of Nix. When you pass the --store /some/path flag to nix-build, it runs the build using that path as the store. nix-build --store=/tmp/store can be thought of as logically equivalent to chroot /tmp/store -- nix-build, except that nix-build --store reads the nix-store executable, source expressions and Nix config from outside the chroot before running the build inside.
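To make the path mapping concrete, here is a minimal shell sketch. It assumes that a chroot store rooted at /tmp/store keeps its files physically under /tmp/store/nix/store, while the build still refers to them as /nix/store/... paths. The helper name physical_path is made up for illustration and is not part of Nix or nix-ci.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nix): map a logical store path, as Nix
# prints it, to where the files live on disk under a chroot store root.
physical_path() {
    root=${1%/}      # store root, with one trailing slash stripped
    logical=$2       # e.g. /nix/store/<hash>-<name>
    printf '%s%s\n' "$root" "$logical"
}

# Example usage (needs Nix installed, so it is left commented out):
#   out=$(nix-build --store /tmp/store --no-out-link)
#   ls "$(physical_path /tmp/store "$out")"
```

This matters when a later step needs to read build outputs from outside the build, since the /nix/store/... path reported by the build does not exist in the image's own store.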
I updated nix-ci to use a chroot store (located at /mnt/nix) for the build. Now that the “build store” didn’t conflict with the “image store” I could mount a cache volume there and benefit from local caching across builds.
[[runners]]
  executor = "docker"
  [runners.docker]
    volumes = ["/mnt/nix"]
Cache volumes work fairly well but aren’t optimal with concurrent jobs. The GitLab Runner will create multiple volumes when jobs run concurrently which results in multiple independent stores. Not only does this waste disk space but a rebuild or redownload may occur just because the required path is in a different cache volume.
This can be resolved by bind-mounting the same volume into all jobs, but concurrent writes will cause errors in this scenario as the different jobs aren’t coordinating.
These issues can be resolved if you are running a nix-daemon on the host. You can then bind the host store to the chroot store used for the build. This way jobs will have access to all paths available on the host and the daemon will be used to build new paths.
Not only does this work correctly for concurrent writes, but the daemon also manages concurrency intelligently over the whole host. For example, the daemon's job limit (Nix's max-jobs setting) will apply across all CI jobs instead of each CI job starting a job per core and overloading the machine.
[[runners]]
  executor = "docker"
  [runners.docker]
    volumes = ["/nix:/mnt/nix:ro"]
You can then tell Nix to use the host daemon with:
nix-build --store 'unix:///mnt/nix/var/nix/daemon-socket/socket?root=/mnt'
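A job script could pick between the two runner setups described in this post at run time. The following is only a sketch under assumptions: the function names are hypothetical, nix-ci may do this differently, and it assumes the volume is mounted at a directory named nix (such as /mnt/nix) so that the parent directory can serve as the root parameter.

```shell
#!/bin/sh
# Hypothetical helpers (not from nix-ci): build a --store value for a volume
# mounted at $1, e.g. /mnt/nix.

# Store URI for a host daemon whose /nix is bind-mounted at $1.
daemon_store_uri() {
    mount=${1%/}
    # root is the parent of the mount, so /nix/store maps to $mount/store.
    printf 'unix://%s/var/nix/daemon-socket/socket?root=%s\n' \
        "$mount" "$(dirname "$mount")"
}

# Use the host daemon if its socket is visible, otherwise treat the
# mount as a plain chroot store (the cache volume setup).
choose_store() {
    mount=${1%/}
    if [ -S "$mount/var/nix/daemon-socket/socket" ]; then
        daemon_store_uri "$mount"
    else
        printf '%s\n' "$mount"
    fi
}
```

A job would then run something like nix-build --store "$(choose_store /mnt/nix)", which resolves to the unix:// URI shown above when the host socket is present.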
One alternative would be to build the image with an alternate store path. However, this requires compiling all packages from source. Doing this will require a lot of resources every time you update instead of simply downloading the prebuilt packages from cache.nixos.org.
My particular use case is GitLab CI which uses Docker images for running jobs but I think this technique can be used any time you need to run a Nix build inside a container based on Nix. Also, this feature doesn’t really appear to be documented so I wanted to share a basic example.