ghc-musl Part 2: Earthly
The ghc-musl
project uses Earthly to build and
test Docker images, as well as update the project README. To quote the
homepage, “It’s like Makefile
and Dockerfile
had a baby.” I have been aware of the project for some time but have
never really checked it out or tried it until yesterday. I did not
consider Earthly as an alternative in The Make Dilemma because I
would rather not require use of Docker in my build scripts, but it has
been interesting to think about the possibility.
Installation
Earthly is very easy to install on Linux. I first planned on
installing the earthly package
from AUR, but it requires a Go compiler to build the executable, and I
prefer to not install Go via the package manager. (I generally prefer to
install development software separately from the OS package manager.)
Since I am just trying it out at this stage, I simply downloaded the
binary from GitHub and linked to it from ~/bin.
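The linking step can be sketched roughly as follows. This is only an illustration of the approach, not the exact commands I ran: link_into_bin and its arguments are hypothetical names, and it assumes the binary has already been downloaded somewhere under the home directory.

```shell
#!/bin/sh
# Sketch: expose an already-downloaded binary as a command via a symlink.
# link_into_bin and its arguments are hypothetical names for illustration.
link_into_bin() {
  src="$1"      # absolute path to the downloaded binary
  bin_dir="$2"  # directory on PATH, e.g. "$HOME/bin"
  name="$3"     # command name to expose, e.g. "earthly"
  chmod 0755 "$src" || return 1
  mkdir -p "$bin_dir" || return 1
  # -f replaces an existing link, so upgrading means relinking a new download.
  ln -sf "$src" "$bin_dir/$name"
}
```

For example, link_into_bin "$HOME/opt/earthly/earthly-linux-amd64" "$HOME/bin" earthly would do the job, where the source path is only an example. Keeping the download outside of ~/bin makes it easy to remove again if the experiment does not work out.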
Earthly has a “bootstrap” phase that installs an earthly/buildkitd
image. A buildkitd
container is started automatically when
you use the software, and it stays running. Having to run a daemon on my
system in order to use Earthly is a significant drawback!
First Exposure
The ghc-musl README
says to use the following command to “build and test all images, and
generate an updated README.md:”
earthly --allow-privileged --artifact +all/README.md
This worked without error after fixing the issue described in ghc-musl Part 1. I wanted to try out the GHC 9.2.2 image, but the images were not retained! That command apparently builds and tests the images but then discards them. It actually took me hours of reading the documentation and experimenting before I finally understood the situation!
Initially, I was unsuccessful at altering the above command to also
retain the build images. I tried many things, from the
promising --image
option to things that are unlikely to
work but might give me a clue about what is going on. I figured that
the --push
option could be used to push the built
images to a registry, but I wanted to test the images locally.
I noticed that the all
target has a FROM
command. I wondered if the images do not show up in my host Docker
environment because they are created in an isolated environment
(container). I tried refactoring the Earthfile
so that the
all
target does not use a FROM
command. I
started to get image output!
I started to draft a new issue to ask about the correct way to retain
images. While documenting my attempt, I ran across some documentation
that made me realize what I was doing wrong! The earthly command
documentation shows that there are different “forms” of the
earthly
command. This is also shown in the
--help
output.
USAGE:
earthly [options] <target-ref>
earthly [options] --image <target-ref>
earthly [options] --artifact <target-ref>/<artifact-path> [<dest-path>]
earthly [options] command [command options]
It looks like the --artifact
and --image
options implement separate commands! They are (sort of) options
because they can be specified more than once, but they are also (sort
of) commands. This CLI design has significant room for improvement! I
was getting image output with my refactored Earthfile
only
because I was testing it with a command that did not update the README!
Stashing my changes, I confirmed that the following command provides image
output even with the original Earthfile.
earthly --allow-privileged +all
Note that I still do not know if it is possible to output both images and artifacts with a single command. I have only had success with doing this using separate commands.
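The two-command workaround can be wrapped in a small script. This is a minimal sketch that only uses the two invocations shown above. The build_all function and the EARTHLY override variable are hypothetical names, added so that the sequence can be dry-run with echo.

```shell
#!/bin/sh
# Sketch: get image output and artifact output with two separate invocations.
# EARTHLY is a hypothetical override; set EARTHLY=echo to print the commands
# instead of running them.
EARTHLY="${EARTHLY:-earthly}"

build_all() {
  # First invocation: build and test, with image output.
  $EARTHLY --allow-privileged +all || return 1
  # Second invocation: same target, but save the README artifact instead.
  $EARTHLY --allow-privileged --artifact +all/README.md
}
```

I would expect the second invocation to mostly reuse cached results from the first, but I have not measured this.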
Next, I wanted to figure out how to build and test specific images. For example, a developer may need to add some dependencies in order to be able to build specific software. It would be convenient to be able to develop such changes using a single version of GHC and build all versions after it is working, if necessary. Building all versions in every iteration is wasteful of both time and resources.
I started to draft a new issue to ask the ghc-musl maintainer how
to do this, and to offer to help write documentation. As I drafted the
issue, however, I felt like I already knew the answer. Though I am new
to Earthly, those hours of trying to get images to output have given me a
better understanding of the software. I suspect that the
Earthfile
needs to be refactored in order to provide
separate targets for each image.
I refactored the Earthfile
to try it out, and it works.
I went ahead and made all improvements, given my current knowledge. I am
not sure if the ghc-musl maintainer will
like such a significant refactor, but it at least illustrates a number
of improvements.
My Refactored Earthfile
This section goes through the whole refactored
Earthfile. I am writing it to document the lessons learned
while they are fresh in my mind. Code is shown before explanations about
that code. Note that the code does not include comments, but I will add
documentation in comments if the ghc-musl maintainer is
interested in a pull request.
VERSION 0.6
The VERSION command is currently optional but will be mandatory in a future version of Earthly.
ARG ALPINE_VERSION=3.15.1
FROM alpine:$ALPINE_VERSION
The current Earthfile
uses a FROM command in
the all
target, which is the only target used via the CLI.
This refactored Earthfile
has more than one target that is
used via the CLI, so a top-level FROM command
provides a common environment for all of them. The all
target builds other targets, and separate environments are not needed
for these delegated build commands.
The current Earthfile
specifies the Alpine version in many
locations. I do not know of a reason to use different versions for
different images, so the refactored Earthfile
specifies the
version in a single location. Note that I use the variable name
ALPINE_VERSION
instead of just ALPINE
for
clarity.
ARG GHC_MUSL_VERSION=23
#ARG BASE_TAG=utdemir/ghc-musl:v$GHC_MUSL_VERSION
ARG BASE_TAG=extremais/ghc-musl:v$GHC_MUSL_VERSION
I renamed the VERSION
variable to
GHC_MUSL_VERSION
to make it easily distinguishable from the
VERSION
command and ALPINE_VERSION.
I am using a custom BASE_TAG
for my local tests, to
ensure that my test images do not overwrite the official images. I
implemented this in the Earthfile
for testing, but note
that the BASE_TAG
can be set when building using a
--build-arg
option.
The current Earthfile
uses the ENV
command to set the BASE_TAG. This is an inappropriate use of
the ENV
command, which should only be used to set environment variables. The ARG command
should be used instead.
These variables are declared at the top-level, outside of any target.
Any change of a top-level variable causes the rebuild of any target
(even if the target does not make use of the changed variable). For this
reason, variables such as TEST_CABAL
and
TEST_STACK
are deliberately not put at the
top-level, as doing so would cause the base-system
target
to rebuild even when doing so is not necessary.
base-system:
  FROM alpine:$ALPINE_VERSION
  RUN apk update \
   && apk add \
        autoconf automake bash binutils-gold curl dpkg fakeroot file \
        findutils g++ gcc git make perl shadow tar xz \
   && apk add \
        brotli brotli-static \
        bzip2 bzip2-dev bzip2-static \
        curl libcurl curl-static \
        freetype freetype-dev freetype-static \
        gmp-dev \
        libffi libffi-dev \
        libpng libpng-static \
        ncurses-dev ncurses-static \
        openssl-dev openssl-libs-static \
        pcre pcre-dev \
        pcre2 pcre2-dev \
        sdl sdl-dev sdl-static \
        sdl2 sdl2-dev \
        sdl2_image sdl2_image-dev \
        sdl2_mixer sdl2_mixer-dev \
        sdl2_ttf sdl2_ttf-dev \
        sdl_image sdl_image-dev \
        sdl_mixer sdl_mixer-dev \
        xz xz-dev \
        zlib zlib-dev zlib-static \
   && ln -s /usr/lib/libncursesw.so.6 /usr/lib/libtinfo.so.6
The base-system
target defines the common system layer
of all of the images. The FROM command
bases the layer on an Alpine image, and the RUN command runs
the necessary commands to initialize the layer. My changes from the
original Earthfile
are trivial.
- Each RUN command creates a separate layer, so it is generally a good practice to minimize them when used in images that are output. In this case, RUN commands should only be separated if a different command must be used in between them.
- I kept the separate apk add commands. The first installs general system software requirements, while the second installs requirements for building static executables with various dependencies.
- I like to sort package name arguments so that it is easy to check if a package is already in the list. Note that the arguments to the second apk add command are sorted by group, not individual package name.
- Some packages are in both apk add commands. This does not really matter, so I kept them.
- I changed the spacing to use less whitespace. Note that Earthly examples use two spaces for indentation.
ghc:
  FROM +base-system
  ARG --required GHC
  ENV GHCUP_INSTALL_BASE_PREFIX=/usr/local
  RUN curl --fail --output /bin/ghcup \
        'https://downloads.haskell.org/ghcup/x86_64-linux-ghcup' \
   && chmod 0755 /bin/ghcup \
   && ghcup upgrade --target /bin/ghcup \
   && ghcup install ghc "$GHC" --set \
   && ghcup install cabal --set \
   && /usr/local/.ghcup/bin/cabal update
  ENV PATH="/usr/local/.ghcup/bin:$PATH"
The ghc
target installs GHCup, GHC, and Cabal, as well as updates the
Cabal index, on top of the base system layer. This is split into
multiple targets in the original Earthfile, but there is no
need to do so. This single target minimizes layers and is easier to read
because it takes much less space on the screen. In general, I think that
every target should have a good reason for existing.
This target creates different layers for different versions of GHC, thanks to the ARG command.
I am not a fan of using GHCUP_INSTALL_BASE_PREFIX
and
/usr/local/.ghcup/bin, but I kept them because they work. Note
that the ENV
command is used appropriately in this target: it sets environment variables within the
container.
The original Earthfile
specifies a version of Cabal to
use per image. Each release of GHC has a minimum version of Cabal that
it works with, but it is fine (preferred) to use newer versions when
available. Installing the minimum version is good for testing
compatibility, but these images are for building executables, not
testing compatibility. I therefore removed Cabal version configuration
altogether, using the ghcup
command to always install the
latest version. I also use the --set
option instead of
separate set
commands.
The cabal update
command uses an absolute path so that
it can be included in the single RUN command
(minimizing layers) before the PATH
environment
variable is set.
The result of the ghc
target is the full image, but the
image is not saved by this target because it should be saved
after testing. If the image were saved before testing, then a broken
image would overwrite any existing images.
test-cabal:
  FROM +ghc
  COPY example /example
  WORKDIR /example/
  RUN cabal new-build example --enable-executable-static
  RUN file $(cabal list-bin example) | grep 'statically linked'
  RUN echo test | $(cabal list-bin example) | grep 'Hello World!'
The test-cabal
target implements the Cabal tests using a built
image. It is based on the ghc
target, but any side effects
of the test are not included in output images.
The original Earthfile
comments out the tests because
the cabal list-bin
command is only available in recent
versions of Cabal. Since the refactored Earthfile
always
uses the most recent version of Cabal, these tests can be run. Note that
the execution test requires an echo
command to close
STDIN
since the example program reads from it.
I did not worry about combining RUN commands in tests because they are not included in output images.
test-stack:
  FROM +ghc
  RUN ghcup install stack --set
  COPY example /example
  WORKDIR /example/
  WITH DOCKER --load ghc-musl=+ghc
    RUN stack build \
      --ghc-options '-static -optl-static -optl-pthread -fPIC' \
      --docker --docker-image ghc-musl
  END
  RUN file $(find /example/.stack-work/install/ -type f -name example) \
    | grep 'statically linked'
  RUN echo test \
    | $(find /example/.stack-work/install/ -type f -name example) \
    | grep 'Hello World!'
The test-stack
target implements the Stack tests using a built
image. It is based on the ghc
target, but any side effects
of the test are not included in output images.
The original Earthfile
uses an earthly/dind image,
which facilitates Docker-in-Docker
functionality. The WITH DOCKER command enables running Docker commands within a Docker
container. I tried using the WITH DOCKER command without earthly/dind and was
surprised to see that it works, because Earthly automatically configures
Docker in that case. I kept that setup only because it allows me to base the test on the
ghc
target and use GHCup to install Stack.
I added some tests that are equivalent to the tests run in the
test-cabal
target.
image:
  FROM +ghc
  ARG TEST_CABAL=1
  ARG TEST_STACK=1
  ARG --required TAG
  IF [ "$TEST_CABAL" = "1" ]
    BUILD +test-cabal
  END
  IF [ "$TEST_STACK" = "1" ]
    BUILD +test-stack
  END
  SAVE IMAGE --push "$TAG"
The image
target only exists in order to run tests
before saving images.
The TEST_CABAL
and TEST_STACK
variables are
used to conditionally run the tests. Tests are run by default but can be
disabled using --build-arg
options. For example, the CI
cannot run the Stack tests due to the use of
--allow-privileged
, so the following command can be
used.
earthly --no-output --build-arg TEST_STACK=0 +all
Note that the SAVE IMAGE
command only actually pushes images when the --push
option
is specified in the earthly
command.
ghc922:
  BUILD +image --GHC=9.2.2 --TAG=$BASE_TAG-ghc922
ghc902:
  BUILD +image --GHC=9.0.2 --TAG=$BASE_TAG-ghc902
ghc8107:
  BUILD +image --GHC=8.10.7 --TAG=$BASE_TAG-ghc8107
Separate targets are provided for each supported GHC version. These
targets run in the container created by the top-level FROM command and
simply delegate to the image
target using BUILD commands,
passing along the required arguments.
For example, the following command can be used to build and test just the GHC 9.2.2 image.
earthly --allow-privileged +ghc922
Enabling this functionality was my primary goal in refactoring the
Earthfile. It is now easy to build and test a specified image
without building all of them!
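After building a single image, a quick smoke test confirms that it landed in the local Docker environment under the expected tag. The following is a sketch under some assumptions: that the image was output locally under the $BASE_TAG-ghcNNN naming used above, and that ghc is on the image's PATH as configured in the ghc target. The image_tag and check_image functions and the DOCKER override variable are hypothetical names.

```shell
#!/bin/sh
# Sketch: smoke-test a locally built image by asking it for its GHC version.
# BASE_TAG defaults to the tag I use for local tests; DOCKER is a hypothetical
# override so that the command can be dry-run with DOCKER=echo.
BASE_TAG="${BASE_TAG:-extremais/ghc-musl:v23}"
DOCKER="${DOCKER:-docker}"

# "9.2.2" -> "<BASE_TAG>-ghc922", mirroring the --TAG arguments above.
image_tag() {
  printf '%s-ghc%s\n' "$BASE_TAG" "$(printf '%s' "$1" | tr -d '.')"
}

# Run the image once and print the GHC version that it provides.
check_image() {
  $DOCKER run --rm "$(image_tag "$1")" ghc --numeric-version
}
```

Running check_image 9.2.2 after earthly --allow-privileged +ghc922 should then print the GHC version if the image was retained.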
readme:
  RUN apk add bash gettext
  COPY ./update-readme.sh .
  RUN ./update-readme.sh \
    "$BASE_TAG-ghc922" \
    "$BASE_TAG-ghc902" \
    "$BASE_TAG-ghc8107"
  SAVE ARTIFACT README.md
The readme
target implements the README update. It also
runs in the container created by the top-level FROM command.
The necessary software is installed, the script is copied, and the
script is run with the appropriate tags. Changing or adding a supported
GHC version requires making changes in more than one location, but I do
not think that the Earthly “language” is powerful enough to resolve this
issue. Perhaps it would be nice if it provided a way to load
configuration from a JSON or YAML file? (/me ducks)
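One workaround outside of Earthly would be to generate the repetitive parts from a single list of versions. The following is only a sketch of that idea, not something the project does. GHC_VERSIONS, ghc_target_name, and print_targets are hypothetical names, and the generated stanzas would still need to be pasted or templated into the Earthfile.

```shell
#!/bin/sh
# Sketch: derive the per-version Earthfile targets from one list of versions.
# All names here are hypothetical; only the stanza format comes from the
# refactored Earthfile.
GHC_VERSIONS="9.2.2 9.0.2 8.10.7"

# "9.2.2" -> "ghc922"
ghc_target_name() {
  printf 'ghc%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Print one delegating target per version. $BASE_TAG is emitted literally
# so that Earthly expands it, not the shell.
print_targets() {
  for version in $GHC_VERSIONS; do
    name=$(ghc_target_name "$version")
    printf '%s:\n  BUILD +image --GHC=%s --TAG=$BASE_TAG-%s\n\n' \
      "$name" "$version" "$name"
  done
}
```

The same list could drive the arguments to update-readme.sh and the BUILD lines of an aggregate target, which would leave one place to edit when a GHC version is added. Whether that is worth an extra generation step for three versions is debatable.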
The script changes the README.md
file within the
container, and the SAVE
ARTIFACT command copies it to the host. By putting this in a
separate target, it is possible to run it separately from image building
and testing, as follows.
earthly --artifact +readme/README.md
Note that the all
target can be used in cases when
building and testing should be done before saving the artifact.
all:
  BUILD +ghc922
  BUILD +ghc902
  BUILD +ghc8107
  BUILD +readme
The all
target simply does everything by delegating to
the various targets using the BUILD
command.
Reflection
That was an interesting first experience of using Earthly. It has
some really nice features as well as a lot of room for improvement. I am
definitely not interested in using it to replace my
Makefiles. I may consider using it for building Docker
images, however. It seems to be a pretty good fit for the ghc-musl project. I think
that the most convenient feature is how it manages intermediate/internal
images.