Obelisk Memory Limiting
As described in the Laptop Issues blog entry, I recently upgraded my laptop to 48GB of RAM because my work requires a lot of memory and my system would often freeze with only 16GB. I figured that I would no longer have memory issues, and I have indeed been enjoying having ample memory, but I ran out of memory this morning! I really do not want to lose my whole session when this happens, so I decided to start limiting RAM usage using cgroups.
When I run out of memory, the culprit is always the same. The project
uses Obelisk,
part of the Reflex FRP ecosystem.
I run ob shell to enter a shell that has the project
dependencies, courtesy of Nix. Running
ob run within that environment runs the project in
development mode. It uses ghcid to build
and run the project, so that it automatically reloads whenever there is
a change to a source file. It displays warnings and errors, so you can
keep an eye on it while you develop and test. Compilation is done using
GHC, the Haskell
compiler. The ghc process consumes all of the memory.
Usually, the kernel “OOM killer” is able to catch the issue,
kill all of my processes, and dump me to a console login screen.
The goal is to contain the Obelisk environment using
cgroups, setting memory limits so that the OOM killer just
kills those processes instead of my whole session.
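The memory.high and memory.max files used in this post belong to the unified (v2) cgroup hierarchy, so it may be worth checking that the memory controller is available before going any further. Here is a minimal sketch, assuming cgroup v2 is mounted at /sys/fs/cgroup. The function name has_memory_controller is my own and not part of any tool.

```shell
# Succeed if the cgroup directory given as $1 lists the "memory" controller
# in its cgroup.controllers file (cgroup v2 only).
# has_memory_controller is a hypothetical helper name.
has_memory_controller() {
  local controllers_file="${1}/cgroup.controllers"
  [ -r "${controllers_file}" ] || return 1
  # The file holds a space-separated list such as "cpu io memory pids".
  grep -qw "memory" "${controllers_file}"
}
```

Running has_memory_controller /sys/fs/cgroup should succeed on a system where the controller is available at the root of the hierarchy.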
I use a cgroup named obelisk:
CGROUP_NAME="obelisk"
Write access to the cgroup.procs file of the parent
cgroup is required in order to execute the initial
cgroup process as a normal user (not root), so
I create the new cgroup under my user slice.
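The write-access requirement can be checked up front, so that a permissions problem shows up before trying to start a process in the cgroup. This is only a sketch: the function name parent_procs_writable is hypothetical, and it takes the parent cgroup directory as its argument.

```shell
# Succeed if the cgroup.procs file in the cgroup directory $1 exists and is
# writable by the current user.
# parent_procs_writable is a hypothetical helper name.
parent_procs_writable() {
  local procs_file="${1}/cgroup.procs"
  [ -f "${procs_file}" ] && [ -w "${procs_file}" ]
}
```

Once the user slice directory is known, parent_procs_writable "${USER_SLICE_DIR}" reports whether a normal user can start the initial process.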
USER_SLICE_PATH="user.slice/user-${UID}.slice"
USER_SLICE_DIR="/sys/fs/cgroup/${USER_SLICE_PATH}"
USER_SLICE_PROCS_FILE="${USER_SLICE_DIR}/cgroup.procs"
CGROUP_PATH="${USER_SLICE_PATH}/${CGROUP_NAME}"
CGROUP_DIR="/sys/fs/cgroup/${CGROUP_PATH}"
Administrative access is required to create the new
cgroup, but my normal user is granted permissions to
administer it.
echo "Creating cgroup ${CGROUP_PATH}..."
sudo cgcreate -a "${USER}" -t "${USER}" -g "memory:${CGROUP_PATH}"
Administrative access is also required to change ownership of the
cgroup.procs file of the parent cgroup.
echo "Setting owner of parent cgroup.procs file..."
sudo chown "${USER}" "${USER_SLICE_PROCS_FILE}"
The memory limits are set as follows. The memory.high
setting sets the memory usage throttle limit. If/when the
cgroup exceeds this threshold, the kernel
throttles the processes and puts them under heavy reclaim pressure. The
memory.max setting sets the hard limit on memory usage. If/when
the cgroup hits this limit and usage cannot be reduced, the OOM killer kills
processes in the cgroup. The memory.swap.high and memory.swap.max
settings are the corresponding throttle and hard limits for swap usage.
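The limit files accept values with a size suffix, but they report plain byte counts when read back. As far as I understand, the kernel treats K, M, and G as powers of 1024, so a value such as 32G should read back as 34359738368. The following sketch does that conversion so that a written value can be compared with what the file reports. The name to_bytes is my own.

```shell
# Convert a size such as "32G" or "128M" to bytes using binary (1024-based)
# multipliers. Values without a recognized suffix are printed unchanged.
# to_bytes is a hypothetical helper name.
to_bytes() {
  local size="$1"
  local number="${size%[KMGkmg]}"     # strip one trailing suffix, if present
  local suffix="${size#"${number}"}"  # the suffix that was stripped, if any
  case "${suffix}" in
    K|k) echo $(( number * 1024 )) ;;
    M|m) echo $(( number * 1024 * 1024 )) ;;
    G|g) echo $(( number * 1024 * 1024 * 1024 )) ;;
    "")  echo "${number}" ;;
  esac
}
```

For example, the output of to_bytes 36G can be compared with the contents of the memory.max file after the limit has been written.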
echo -n "Setting memory limit (soft): "
echo "32G" | tee "${CGROUP_DIR}/memory.high"
echo -n "Setting memory limit (hard): "
echo "36G" | tee "${CGROUP_DIR}/memory.max"
echo -n "Setting swap limit (soft): "
echo "128M" | tee "${CGROUP_DIR}/memory.swap.high"
echo -n "Setting swap limit (hard): "
echo "256M" | tee "${CGROUP_DIR}/memory.swap.max"
In my project directory, I can run a bash shell in the
new cgroup, as my normal user.
$ cgexec -g memory:user.slice/user-1331.slice/obelisk bash
The systemctl status command can be used to show the
cgroup hierarchy. Since I run bash as the
initial process, my hierarchy looks like the following:
/
├─init.scope
│ └─1 /sbin/init
├─system.slice
│ ├─...
│ ...
└─user.slice
  └─user-1331.slice
    ├─obelisk
    │ └─417624 bash
    ├─...
    ...
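Membership can also be checked from inside the shell itself. On cgroup v2, /proc/self/cgroup contains a line of the form 0::/path, so a small check might look like the following sketch. The function name in_cgroup and the optional second argument (which only exists to make the function easy to test against a sample file) are my own.

```shell
# Succeed if the calling shell is in the cgroup path $1 (relative to the
# cgroup root, without a leading slash). Reads /proc/self/cgroup unless a
# different file is given as $2.
# in_cgroup is a hypothetical helper name.
in_cgroup() {
  local want="0::/$1"
  local file="${2:-/proc/self/cgroup}"
  local line
  while IFS= read -r line; do
    # cgroup v2 entries look like "0::/user.slice/user-1331.slice/obelisk"
    if [ "${line}" = "${want}" ]; then
      return 0
    fi
  done < "${file}"
  return 1
}
```

In the shell started by cgexec, in_cgroup "user.slice/user-${UID}.slice/obelisk" should succeed, while in a regular terminal it should fail.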
Within that shell, I can run ob shell and
ob run as usual. With everything running, my hierarchy
looks like the following:
/
├─init.scope
│ └─1 /sbin/init
├─system.slice
│ ├─...
│ ...
└─user.slice
  └─user-1331.slice
    ├─obelisk
    │ ├─417624 bash
    │ ├─435360 /home/tcard/.nix-profile/bin/ob shell
    │ ├─443578 ./.obelisk/impl/.attr-cache/command.out/bin/ob --no-handoff shell
    │ ├─443606 bash --rcfile /tmp/nix-shell-443606-0/rc
    │ ├─443651 /home/tcard/.nix-profile/bin/ob run
    │ ├─451869 ./.obelisk/impl/.attr-cache/command.out/bin/ob --no-handoff run
    │ ├─452114 bash /run/user/1331/nix-shell-452114-0/rc
    │ ├─452147 /nix/store/XXXXXXXX-ghcid-0.8/bin/ghcid ...
    │ └─452152 /nix/store/XXXXXXXX-ghc-8.6.5/lib/ghc-8.6.5/bin/ghc ...
    ├─...
    ...
The problematic ghc process is run within the new
cgroup, so it should now only crash that
cgroup instead of my whole session if/when it consumes
excess memory.
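To confirm after the fact that the cgroup limit was what killed a build, the memory.events file of the cgroup can be inspected. On cgroup v2 it holds one counter per line, including high, max, oom, and oom_kill. The following is a small sketch for reading the kill counter. The name oom_kill_count is my own, and it takes the cgroup directory as its argument.

```shell
# Print the oom_kill counter from the memory.events file in the cgroup
# directory $1. Each line of that file has the form "<key> <count>".
# oom_kill_count is a hypothetical helper name.
oom_kill_count() {
  awk '$1 == "oom_kill" { print $2 }' "${1}/memory.events"
}
```

A non-zero result from oom_kill_count "${CGROUP_DIR}" would indicate that the OOM killer has killed at least one process in the cgroup.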