ghc-musl Part 2: Earthly

The ghc-musl project uses Earthly to build and test Docker images, as well as to update the project README. To quote the homepage, “It’s like Makefile and Dockerfile had a baby.” I have been aware of the project for some time but had never really checked it out or tried it until yesterday. I did not consider Earthly as an alternative in The Make Dilemma because I would rather not require use of Docker in my build scripts, but it has been interesting to think about the possibility.

Installation

Earthly is very easy to install on Linux. I first planned on installing the earthly package from AUR, but it requires a Go compiler to build the executable, and I prefer to not install Go via the package manager. (I generally prefer to install development software separately from the OS package manager.) Since I am just trying it out at this stage, I simply downloaded the binary from GitHub and linked to it from ~/bin.

Earthly has a “bootstrap” phase that installs an earthly/buildkitd image. A buildkitd container is started automatically when you use the software, and it stays running. Having to run a daemon on my system in order to use Earthly is a significant drawback!

First Exposure

The ghc-musl README says to use the following command to “build and test all images, and generate an updated README.md:”

earthly --allow-privileged --artifact +all/README.md

This worked without error after fixing the issue described in ghc-musl Part 1. I wanted to try out the GHC 9.2.2 image, but the images were not retained! That command apparently builds and tests the images but then discards them. It actually took me hours of reading the documentation and experimenting before I finally understood the situation!

Initially, I was unsuccessful at altering the above command to also retain the built images. I tried many things, from the promising --image option to things that were unlikely to work but might give me a clue about what was going on. I figured that the --push option could be used to push the built images to a registry, but I wanted to test the images locally.

I noticed that the all target has a FROM command. I wondered if the images do not show up in my host Docker environment because they are created in an isolated environment (container). I tried refactoring the Earthfile so that the all target does not use a FROM command. I started to get image output!

I started to draft a new issue to ask about the correct way to retain images. While documenting my attempt, I ran across some documentation that made me realize what I was doing wrong! The earthly command documentation shows that there are different “forms” of the earthly command. This is also shown in the --help output.

USAGE:
     earthly [options] <target-ref>

     earthly [options] --image <target-ref>

     earthly [options] --artifact <target-ref>/<artifact-path> [<dest-path>]

     earthly [options] command [command options]

It looks like the --artifact and --image options implement separate commands! They are (sort of) options because they can be specified more than once, but they are also (sort of) commands. This CLI design has significant room for improvement! I was getting image output with my refactored Earthfile only because I was testing it with a command that did not update the README! Stashing my changes, I confirmed that the following command provides image output even with the original Earthfile.

earthly --allow-privileged +all

Note that I still do not know if it is possible to output both images and artifacts with a single command. I have only had success with doing this using separate commands.
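The two-command approach can be sketched as a small shell wrapper. This is only an illustration: the function name build_all is an assumption, and the two invocations are the ones shown in this post, run back to back.

```shell
# Build and test all images (image output), then save the README artifact.
# The second command runs only if the first one succeeds.
build_all() {
  earthly --allow-privileged +all &&
    earthly --allow-privileged --artifact +all/README.md
}
```

Calling build_all from the directory that contains the Earthfile runs both forms in sequence. The second invocation should be able to reuse cached work from the first where caching applies.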

Next, I wanted to figure out how to build and test specific images. For example, a developer may need to add some dependencies in order to be able to build specific software. It would be convenient to be able to develop such changes using a single version of GHC and build all versions after it is working, if necessary. Building all versions in every iteration is wasteful of both time and resources.

I started to draft a new issue to ask the ghc-musl maintainer how to do this, and to offer to help write documentation. As I drafted the issue, however, I felt like I already knew the answer. Though I am new to Earthly, those hours of trying to get images to output have given me a better understanding of the software. I suspected that the Earthfile needed to be refactored in order to provide separate targets for each image.

I refactored the Earthfile to try it out, and it works. I went ahead and made all improvements, given my current knowledge. I am not sure if the ghc-musl maintainer will like such a significant refactor, but it at least illustrates a number of improvements.

My Refactored Earthfile

This section goes through the whole refactored Earthfile. I am writing it to document the lessons learned while they are fresh in my mind. Code is shown before explanations about that code. Note that the code does not include comments, but I will add documentation in comments if the ghc-musl maintainer is interested in a pull request.

VERSION 0.6

The VERSION command is currently optional but will be mandatory in a future version of Earthly.

ARG ALPINE_VERSION=3.15.1
FROM alpine:$ALPINE_VERSION

The current Earthfile uses a FROM command in the all target, which is the only target used via the CLI. This refactored Earthfile has more than one target that is used via the CLI, so a top-level FROM command provides a common environment for all of them. The all target builds other targets, and separate environments are not needed for these delegated build commands.

The current Earthfile specifies the Alpine version in many locations. I do not know of a reason to use different versions for different images, so the refactored Earthfile specifies the version in a single location. Note that I use the variable name ALPINE_VERSION instead of just ALPINE for clarity.

ARG GHC_MUSL_VERSION=23
#ARG BASE_TAG=utdemir/ghc-musl:v$GHC_MUSL_VERSION
ARG BASE_TAG=extremais/ghc-musl:v$GHC_MUSL_VERSION

I renamed the VERSION variable to GHC_MUSL_VERSION to make it easily distinguishable from the VERSION command and ALPINE_VERSION.

I am using a custom BASE_TAG for my local tests, to ensure that my test images do not overwrite the official images. I implemented this in the Earthfile for testing, but note that the BASE_TAG can also be set at build time using a --build-arg option.
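As a rough sketch of the --build-arg alternative, the following shell function overrides BASE_TAG on the command line instead of editing the Earthfile. The function name build_with_tag and the tag example/ghc-musl:v23 are placeholders for illustration, not names used by the project.

```shell
# Build one target with a custom BASE_TAG, leaving the Earthfile untouched.
# Usage: build_with_tag <base-tag> <target>
build_with_tag() {
  earthly --allow-privileged --build-arg "BASE_TAG=$1" "+$2"
}

# Example (placeholder tag): build_with_tag 'example/ghc-musl:v23' all
```

Quoting the BASE_TAG argument keeps the tag intact as a single argument, whatever value is passed in.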

The current Earthfile uses the ENV command to set the BASE_TAG. This is inappropriate usage of the ENV command, which should only be used to set environment variables. The ARG command should be used instead.

These variables are declared at the top level, outside of any target. Any change to a top-level variable causes every target to be rebuilt (even if the target does not make use of the changed variable). For this reason, variables such as TEST_CABAL and TEST_STACK are deliberately not put at the top level, as doing so would cause the base-system target to rebuild even when that is not necessary.

base-system:
  FROM alpine:$ALPINE_VERSION
  RUN apk update \
   && apk add \
        autoconf automake bash binutils-gold curl dpkg fakeroot file \
        findutils g++ gcc git make perl shadow tar xz \
   && apk add \
        brotli brotli-static \
        bzip2 bzip2-dev bzip2-static \
        curl libcurl curl-static \
        freetype freetype-dev freetype-static \
        gmp-dev \
        libffi libffi-dev \
        libpng libpng-static \
        ncurses-dev ncurses-static \
        openssl-dev openssl-libs-static \
        pcre pcre-dev \
        pcre2 pcre2-dev \
        sdl sdl-dev sdl-static \
        sdl2 sdl2-dev \
        sdl2_image sdl2_image-dev \
        sdl2_mixer sdl2_mixer-dev \
        sdl2_ttf sdl2_ttf-dev \
        sdl_image sdl_image-dev \
        sdl_mixer sdl_mixer-dev \
        xz xz-dev \
        zlib zlib-dev zlib-static \
   && ln -s /usr/lib/libncursesw.so.6 /usr/lib/libtinfo.so.6

The base-system target defines the common system layer of all of the images. The FROM command bases the layer on an Alpine image, and the RUN command runs the necessary commands to initialize the layer. My changes from the original Earthfile are trivial.

  • Each RUN command creates a separate layer, so it is generally a good practice to minimize them when used in images that are output. In this case, RUN commands should only be separated if a different command must be used in between them.
  • I kept the separate apk add commands. The first installs general system software requirements, while the second installs requirements for building static executables with various dependencies.
  • I like to sort package name arguments so that it is easy to check if a package is already in the list. Note that the arguments to the second apk add command are sorted by group, not individual package name.
  • Some packages are in both apk add commands. This does not really matter, so I kept them.
  • I changed the spacing to use less whitespace. Note that Earthly examples use two spaces for indentation.

ghc:
  FROM +base-system
  ARG --required GHC
  ENV GHCUP_INSTALL_BASE_PREFIX=/usr/local
  RUN curl --fail --output /bin/ghcup \
        'https://downloads.haskell.org/ghcup/x86_64-linux-ghcup' \
   && chmod 0755 /bin/ghcup \
   && ghcup upgrade --target /bin/ghcup \
   && ghcup install ghc "$GHC" --set \
   && ghcup install cabal --set \
   && /usr/local/.ghcup/bin/cabal update
  ENV PATH="/usr/local/.ghcup/bin:$PATH"

The ghc target installs GHCup, GHC, and Cabal, as well as updates the Cabal index, on top of the base system layer. This is split into multiple targets in the original Earthfile, but there is no need to do so. This single target minimizes layers and is easier to read because it takes much less space on the screen. In general, I think that every target should have a good reason for existing.

This target creates different layers for different versions of GHC, thanks to the ARG command.

I am not a fan of using GHCUP_INSTALL_BASE_PREFIX and /usr/local/.ghcup/bin, but I kept them because they work. Note that the ENV command in this target is used to set environment variables within the container, which is its intended purpose.

The original Earthfile specifies a version of Cabal to use per image. Each release of GHC has a minimum version of Cabal that it works with, but it is fine (preferred) to use newer versions when available. Installing the minimum version is good for testing compatibility, but these images are for building executables, not testing compatibility. I therefore removed Cabal version configuration altogether, using the ghcup command to always install the latest version. I also use the --set option instead of separate set commands.

The cabal update command uses an absolute path so that it can be included in the single RUN command (minimizing layers) before the PATH environment variable is set.

The result of the ghc target is the full image, but the image is not saved by this target because it should be saved after testing. If the image were saved before testing, then a broken image would overwrite any existing images.

test-cabal:
  FROM +ghc
  COPY example /example
  WORKDIR /example/
  RUN cabal new-build example --enable-executable-static
  RUN file $(cabal list-bin example) | grep 'statically linked'
  RUN echo test | $(cabal list-bin example) | grep 'Hello World!'

The test-cabal target implements the Cabal tests using a built image. It is based on the ghc target, but any side effects of the test are not included in output images.

The original Earthfile comments out the tests because the cabal list-bin command is only available in recent versions of Cabal. Since the refactored Earthfile always uses the most recent version of Cabal, these tests can be run. Note that the execution test requires an echo command to close STDIN since the example program reads from it.

I did not worry about combining RUN commands in tests because they are not included in output images.

test-stack:
  FROM +ghc
  RUN ghcup install stack --set
  COPY example /example
  WORKDIR /example/
  WITH DOCKER --load ghc-musl=+ghc
    RUN stack build \
          --ghc-options '-static -optl-static -optl-pthread -fPIC' \
          --docker --docker-image ghc-musl
  END
  RUN file $(find /example/.stack-work/install/ -type f -name example) \
    | grep 'statically linked'
  RUN echo test \
    | $(find /example/.stack-work/install/ -type f -name example) \
    | grep 'Hello World!'

The test-stack target implements the Stack tests using a built image. It is based on the ghc target, but any side effects of the test are not included in output images.

The original Earthfile uses an earthly/dind image, which facilitates Docker-in-Docker functionality. The WITH DOCKER command enables running Docker commands within a Docker container. I tried using the WITH DOCKER command without earthly/dind and was surprised to see that it works, because Earthly automatically configures Docker in that case. I kept this approach only because it allows me to base the test on the ghc target and use GHCup to install Stack.

I added some tests that are equivalent to the tests run in the test-cabal target.

image:
  FROM +ghc
  ARG TEST_CABAL=1
  ARG TEST_STACK=1
  ARG --required TAG
  IF [ "$TEST_CABAL" = "1" ]
    BUILD +test-cabal
  END
  IF [ "$TEST_STACK" = "1" ]
    BUILD +test-stack
  END
  SAVE IMAGE --push "$TAG"

The image target only exists in order to run tests before saving images.

The TEST_CABAL and TEST_STACK variables are used to conditionally run the tests. Tests are run by default but can be disabled using --build-arg options. For example, the CI cannot run the Stack tests due to the use of --allow-privileged, so the following command can be used.

earthly --no-output --build-arg TEST_STACK=0 +all

Note that the SAVE IMAGE command only actually pushes images when the --push option is specified in the earthly command.
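The difference between a local build and a pushing build can be sketched with two shell functions. The function names are assumptions for illustration; the options are the ones discussed in this post.

```shell
# Build and test the images locally; nothing is pushed to a registry.
build_local() {
  earthly --allow-privileged +all
}

# Same build, but with --push so that SAVE IMAGE --push also pushes each tag.
build_and_push() {
  earthly --allow-privileged --push +all
}
```

Keeping the push in a separate, explicitly named entry point makes it harder to push test images by accident.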

ghc922:
  BUILD +image --GHC=9.2.2 --TAG=$BASE_TAG-ghc922

ghc902:
  BUILD +image --GHC=9.0.2 --TAG=$BASE_TAG-ghc902

ghc8107:
  BUILD +image --GHC=8.10.7 --TAG=$BASE_TAG-ghc8107

Separate targets are provided for each supported GHC version. These targets run in the container created by the top-level FROM command and simply delegate to the image target using BUILD commands, passing along the required arguments.

For example, the following command can be used to build and test just the GHC 9.2.2 image.

earthly --allow-privileged +ghc922

Enabling this functionality was my primary goal of refactoring the Earthfile. It is easy to build and test a specified image without building all of them!

readme:
  RUN apk add bash gettext
  COPY ./update-readme.sh .
  RUN ./update-readme.sh \
        "$BASE_TAG-ghc922" \
        "$BASE_TAG-ghc902" \
        "$BASE_TAG-ghc8107"
  SAVE ARTIFACT README.md

The readme target implements the README update. It also runs in the container created by the top-level FROM command. The necessary software is installed, the script is copied, and the script is run with the appropriate tags. Changing or adding a supported GHC version requires making changes in more than one location, but I do not think that the Earthly “language” is powerful enough to resolve this issue. Perhaps it would be nice if it provided a way to load configuration from a JSON or YAML file? (/me ducks)
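Outside of Earthly, a small shell helper could at least derive the tag list from a single list of versions, for example to compute the arguments passed to update-readme.sh. This is only a sketch under stated assumptions: ghc_tag and ghc_tags are hypothetical names, the base tag is a placeholder, and the Earthfile targets themselves would still need one entry per version.

```shell
# Derive an image tag from a base tag and a GHC version by removing the dots.
# Usage: ghc_tag <base-tag> <ghc-version>
ghc_tag() {
  printf '%s-ghc%s\n' "$1" "$(printf '%s' "$2" | tr -d '.')"
}

# Print one tag per line for every version given after the base tag.
# Usage: ghc_tags <base-tag> <ghc-version>...
ghc_tags() {
  base_tag="$1"
  shift
  for version in "$@"; do
    ghc_tag "$base_tag" "$version"
  done
}

# Example (placeholder base tag):
#   ghc_tags 'example/ghc-musl:v23' 9.2.2 9.0.2 8.10.7
```

The naming rule mirrors the tags used in this Earthfile, where 9.2.2 becomes ghc922 and 8.10.7 becomes ghc8107.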

The script changes the README.md file within the container, and the SAVE ARTIFACT command copies it to the host. By putting this in a separate target, it is possible to run it separately from image building and testing, as follows.

earthly --artifact +readme/README.md

Note that the all target can be used in cases when building and testing should be done before saving the artifact.

all:
  BUILD +ghc922
  BUILD +ghc902
  BUILD +ghc8107
  BUILD +readme

The all target simply does everything by delegating to the various targets using the BUILD command.

Reflection

That was an interesting first experience of using Earthly. It has some really nice features as well as a lot of room for improvement. I am definitely not interested in using it to replace my Makefiles. I may consider using it for building Docker images, however. It seems to be a pretty good fit for the ghc-musl project. I think that the most convenient feature is how it manages intermediate/internal images.