Haskell Monorepo GitHub Actions (Part 1)

I am using GitHub Actions to run automated tests for my released projects. All of the personal projects that I have released so far have a single package per Git repository. I will soon need to use GitHub Actions to run tests for multiple Haskell packages in a single repository (aka “monorepo”), however. This blog entry discusses my initial thoughts about this.

My currently released personal projects all run tests when there is a push to the develop or main branches, as well as on pull requests. The Haskell projects run tests with the following configurations.

Tests are run using Cabal, for all supported GHC versions
Tests are run using Stack, for all supported GHC versions
Tests are run for all supported Cabal versions
Tests are run against the lower bounds of dependencies
Tests are run against the upper bounds of dependencies

Since one of my goals for these projects is to support a wide range of dependency versions, each project has quite a few tests. For example, TTC currently supports 11 major GHC versions and 8 major Cabal versions, resulting in a total of 32 test jobs per push.
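
To make the shape of such a job set concrete, here is a minimal sketch of a workflow with the triggers described above and a single Cabal job matrix over GHC versions. It is not the actual TTC configuration: the job name, the GHC versions listed, and the use of the haskell-actions/setup action are assumptions made for illustration.

```yaml
name: CI

on:
  push:
    branches:
      - develop
      - main
  pull_request:

jobs:
  cabal:
    name: "Cabal: GHC ${{ matrix.ghc }}"
    runs-on: ubuntu-latest
    strategy:
      # Let the remaining matrix jobs finish when one of them fails
      fail-fast: false
      matrix:
        # Illustrative versions only, not the full supported list
        ghc: ['9.4.8', '9.6.5', '9.8.2']
    steps:
      - uses: actions/checkout@v4
      - uses: haskell-actions/setup@v2
        with:
          ghc-version: ${{ matrix.ghc }}
          cabal-version: latest
      - run: cabal update
      - run: cabal build --enable-tests
      - run: cabal test
```

Each entry in the matrix becomes its own job, which is why the number of jobs grows quickly as supported versions are added.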

One additional note is that my GitHub Actions configuration for Haskell projects is mostly the same for all projects. I try to factor out project-specific information that can be loaded by a config job in order to decrease the maintenance cost. There is still room for improvement. This is one of the things that I will think about while developing the new configuration.
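
One way that a config job can feed project-specific information into a job matrix is through job outputs and the fromJSON expression function. The following is a minimal sketch under stated assumptions: the file ci/ghc-versions.json, the step ID, and the output name ghcvers are hypothetical, and the file is assumed to contain a JSON array of GHC version strings.

```yaml
jobs:
  config:
    name: "Load configuration"
    runs-on: ubuntu-latest
    outputs:
      ghcvers: ${{ steps.set-ghcvers.outputs.ghcvers }}
    steps:
      - uses: actions/checkout@v4
      - id: set-ghcvers
        # Hypothetical file containing, for example, ["9.6.5","9.8.2"]
        run: echo "ghcvers=$(jq -c . ci/ghc-versions.json)" >> "$GITHUB_OUTPUT"

  cabal:
    name: "Cabal: GHC ${{ matrix.ghc }}"
    needs: config
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        # The matrix is built from the output of the config job
        ghc: ${{ fromJSON(needs.config.outputs.ghcvers) }}
    steps:
      - uses: actions/checkout@v4
      - uses: haskell-actions/setup@v2
        with:
          ghc-version: ${{ matrix.ghc }}
      - run: cabal update
      - run: cabal test
```

With this layout, the workflow file itself can stay identical across projects, and only the loaded data differs.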

As I wrote in my recent status report, I am developing a new design of TTC that I plan to release in separate packages. The core library supports the same GHC versions as TTC, while companion libraries provide support for new types that have narrower compatibility. I currently plan to manage all of these packages within the ttc-haskell repository, making it easy to coordinate changes across them. In addition, I have some more projects with more than one package per repository that I hope to release someday (soon).

I need to determine how to configure GitHub Actions to run automated tests for all of the packages in the repository. The naïve approach would be to run completely separate tests for each package, but this would result in excessive use of resources, in terms of monetary costs (to GitHub), carbon footprint, and time. I would like to avoid this.

I was unable to find any blog entries about using GitHub Actions with a Haskell monorepo. Searching for examples, I checked out various Haskell projects with more than one package in a single repository. Most of them are not helpful. For example, a number of them use Nix to test only the exact versions of dependencies configured in the flake, which is the opposite of my goals.

The following projects are the most helpful as inspiration for me. All of these projects test all packages in each test job, so there is no excessive use of resources.

haskell-servant/servant has a very straightforward configuration with a single job matrix that runs tests using Cabal on Linux for various GHC versions. In each job, all projects are built and tested.
yesodweb/wai has a very straightforward configuration with a single job matrix that runs tests using Stack on various OSes for nightly and various LTS snapshots. In each job, all projects are built and tested, and documentation is built.
haskell/haskell-language-server has a more involved configuration that runs tests using Cabal on various OSes for various GHC versions. In each job, all projects are built and tested when appropriate, using conditionals such as if: matrix.test && matrix.ghc != '9.10'.
simonmichael/hledger has a more involved configuration that runs tests using Stack on Linux with the single version of GHC that is on the Ubuntu container. Perhaps this is done so that the tests run quickly. In each job, all projects are built and tested.
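
The common pattern in these projects, testing every package within each job, can be sketched with Cabal as follows. This is a simplified illustration and not a copy of any of the configurations above: it assumes a cabal.project file at the repository root that lists all packages, and the GHC versions are placeholders.

```yaml
jobs:
  build:
    name: "GHC ${{ matrix.ghc }}"
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        # Placeholder versions
        ghc: ['9.6.5', '9.8.2']
    steps:
      - uses: actions/checkout@v4
      - uses: haskell-actions/setup@v2
        with:
          ghc-version: ${{ matrix.ghc }}
      - run: cabal update
      # The "all" target selects every package listed in cabal.project
      - name: Build all packages
        run: cabal build all --enable-tests
      - name: Test all packages
        run: cabal test all
```

Because dependencies shared between packages are built once per job, the total work is much lower than running a separate job matrix for each package.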

Researching the topic in general, I learned that GitHub Actions allows you to filter workflow triggers by path (see the documentation). This makes it possible to run separate CI jobs depending on which packages have been changed. For example, a configuration like the following might be appropriate in a .github/workflows/ci-ttc.yml file that runs tests for the ttc package.

on:
  push:
    branches:
      - develop
      - main
    paths:
      - 'ttc/**'
      - '.github/workflows/ci-ttc.yml'

This functionality helps avoid running tests for unchanged packages. Changes across many packages would trigger separate tests for all of those packages, however. For example, adding support for a new GHC minor release is often done across all packages. One would either need to use many separate caches, perhaps exceeding the storage limit, or configure sequencing in order to share caches without conflict. Perhaps this is most useful when testing monorepos that contain largely unrelated packages that are essentially separate projects.
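
To illustrate the "many separate caches" option, here is a minimal sketch of a per-package workflow that prefixes its cache key with the package name so that it does not collide with the caches of other workflows. The key format, the step ID, and the path ttc/ttc.cabal are assumptions made for illustration.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Placeholder versions
        ghc: ['9.6.5', '9.8.2']
    steps:
      - uses: actions/checkout@v4
      - id: setup
        uses: haskell-actions/setup@v2
        with:
          ghc-version: ${{ matrix.ghc }}
      - uses: actions/cache@v4
        with:
          path: ${{ steps.setup.outputs.cabal-store }}
          # The "ttc-" prefix keeps this workflow's cache apart from others
          key: ttc-${{ runner.os }}-ghc-${{ matrix.ghc }}-${{ hashFiles('ttc/ttc.cabal') }}
          restore-keys: |
            ttc-${{ runner.os }}-ghc-${{ matrix.ghc }}-
      - run: cabal update
      - run: cabal test
```

Every package workflow and GHC version combination gets its own cache entry here, which is how the storage used can add up quickly.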

I found a feature request to allow workflow configuration in sub-folders that has a lot of interesting discussion. I learned that GitHub Actions does not support symbolic links, so many people manage separate workflows in subdirectories that they copy to the .github/workflows directory in the project root to enable them. A project called Hawk provides a CLI to make such management easy.

I will now take some time to “digest” this information. I am currently leaning toward minimizing the number of jobs and making steps conditional when necessary. I would like to avoid project-specific configuration. I know how to do so by putting conditionals into scripts, but I would like to continue to have meaningful (fine-grained) tasks in the GitHub Actions UI.
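
A minimal sketch of this direction, assuming one job matrix with a named step per package, might look like the following. The package name ttc-example, the matrix flag companion, and the GHC versions are hypothetical. The include entry adds the flag to the matching matrix combination, and the flag is empty (and therefore false) for the others.

```yaml
jobs:
  cabal:
    name: "Cabal: GHC ${{ matrix.ghc }}"
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        ghc: ['8.10.7', '9.8.2']
        include:
          # Hypothetical flag: the companion package only supports this GHC
          - ghc: '9.8.2'
            companion: true
    steps:
      - uses: actions/checkout@v4
      - uses: haskell-actions/setup@v2
        with:
          ghc-version: ${{ matrix.ghc }}
      - run: cabal update
      - name: "Test ttc"
        run: cabal test ttc
      - name: "Test ttc-example"
        # Skipped when the matrix flag is not set for this combination
        if: matrix.companion
        run: cabal test ttc-example
```

Each package shows up as its own step in the GitHub Actions UI, and skipped steps are marked as such. Note that Cabal plans dependencies for every package listed in the project file, so the project file would likely also need to vary with the GHC version. That part is not shown here.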

A related change that I am considering is removing the tests of the supported Cabal versions, aside from the oldest. Other tests use the latest version, and perhaps just testing the oldest supported is sufficient.

Author

Travis Cardwell

Published

July 5, 2024