Obelisk Memory Limiting
As described in the Laptop Issues blog entry, I recently upgraded my laptop to 48GB of RAM because my work requires a lot of memory and my system would often freeze with only 16GB. I figured that I would no longer have memory issues, and I have indeed been enjoying having ample memory, but I ran out of memory this morning! I really do not want my whole session to crash when this happens, so I decided to start limiting RAM usage using cgroups.
When I run out of memory, the culprit is always the same. The project uses Obelisk, part of the Reflex FRP ecosystem. I run ob shell to enter a shell that has the project dependencies, courtesy of Nix. Running ob run within that environment runs the project in development mode. It uses ghcid to build and run the project, so that it automatically reloads whenever there is a change to a source file. It displays warnings and errors, so you can keep an eye on it while you develop and test. Compilation is done using GHC, the Haskell compiler. The ghc process consumes all of the memory.
Usually, the kernel “OOM killer” is able to catch the issue,
kill all of my processes, and dump me to a console login screen.
The goal is to contain the Obelisk environment using cgroups, setting memory limits so that the OOM killer just kills those processes instead of my whole session.
I use a cgroup named obelisk:
CGROUP_NAME="obelisk"
Write access to the cgroup.procs file of the parent cgroup is required in order to execute the initial cgroup process as a normal user (not root), so I create the new cgroup under my user slice.
USER_SLICE_PATH="user.slice/user-${UID}.slice"
USER_SLICE_DIR="/sys/fs/cgroup/${USER_SLICE_PATH}"
USER_SLICE_PROCS_FILE="${USER_SLICE_DIR}/cgroup.procs"
CGROUP_PATH="${USER_SLICE_PATH}/${CGROUP_NAME}"
CGROUP_DIR="/sys/fs/cgroup/${CGROUP_PATH}"
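The memory.high and memory.max files used later belong to the cgroup v2 (unified) hierarchy, so it can be worth checking that the memory controller is available before going further. The following is a minimal sketch, not part of my original script; the function name has_memory_controller is hypothetical, and it assumes that the cgroup.controllers file of a cgroup directory lists the controllers available there.

```shell
# Hypothetical helper: succeed only if DIR (default: the cgroup root) has a
# cgroup.controllers file that lists the "memory" controller.
has_memory_controller() {
  local dir="${1:-/sys/fs/cgroup}"
  [ -r "${dir}/cgroup.controllers" ] &&
    grep -qw memory "${dir}/cgroup.controllers"
}
```

For example, has_memory_controller "${USER_SLICE_DIR}" || echo "memory controller not available" would flag a system where the limits below cannot be applied.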
Administrative access is required to create the new cgroup, but my normal user is granted permissions to administer it.
echo "Creating cgroup ${CGROUP_PATH}..."
sudo cgcreate -a "${USER}" -t "${USER}" -g "memory:${CGROUP_PATH}"
Administrative access is also required to change ownership of the cgroup.procs file of the parent cgroup.
echo "Setting owner of parent cgroup.procs file..."
sudo chown "${USER}" "${USER_SLICE_PROCS_FILE}"
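Since both steps above exist to give my normal user write access, a quick check can confirm that they worked before trying to start a process in the cgroup. This is only a sketch; check_writable is a hypothetical helper name, and it relies on nothing more than the shell's -w test.

```shell
# Hypothetical helper: report whether the current user can write to a file.
# Returns non-zero when the file is missing or not writable.
check_writable() {
  if [ -w "$1" ]; then
    echo "writable: $1"
  else
    echo "NOT writable: $1"
    return 1
  fi
}
```

Running check_writable "${USER_SLICE_PROCS_FILE}" and check_writable "${CGROUP_DIR}/cgroup.procs" as the normal user should report both files as writable.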
The memory limits are set as follows. The memory.high setting sets the memory usage throttle limit. If/when the cgroup exceeds the configured threshold, the kernel throttles the processes and puts them under heavy reclaim pressure. The memory.max setting sets the maximum memory usage. If/when the cgroup hits this limit and usage cannot be reduced, the OOM killer kills processes in the cgroup. The memory.swap.high and memory.swap.max settings apply the same kind of throttle and hard limits to swap usage.
echo -n "Setting memory limit (soft): "
echo "32G" | tee "${CGROUP_DIR}/memory.high"
echo -n "Setting memory limit (hard): "
echo "36G" | tee "${CGROUP_DIR}/memory.max"
echo -n "Setting swap limit (soft): "
echo "128M" | tee "${CGROUP_DIR}/memory.swap.high"
echo -n "Setting swap limit (hard): "
echo "256M" | tee "${CGROUP_DIR}/memory.swap.max"
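To confirm that the limits were accepted, the same files can be read back. The following is a minimal sketch; show_memory_limits is a hypothetical helper, and it simply prints whichever of the four limit files are readable in the given directory.

```shell
# Hypothetical helper: print the limit files of a cgroup directory as
# "name=value" lines, skipping any file that is not readable.
show_memory_limits() {
  local dir="$1" name
  for name in memory.high memory.max memory.swap.high memory.swap.max; do
    if [ -r "${dir}/${name}" ]; then
      printf '%s=%s\n' "${name}" "$(cat "${dir}/${name}")"
    fi
  done
}
```

Calling show_memory_limits "${CGROUP_DIR}" should list all four settings. The kernel reports the values in bytes, so I would expect the 32G soft limit to read back as 34359738368, and an unset limit to read back as max.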
In my project directory, I can run a bash shell in the new cgroup, as my normal user.
$ cgexec -g memory:user.slice/user-1331.slice/obelisk bash
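Once inside the new shell, it is easy to verify that it really landed in the cgroup. On a cgroup v2 system, /proc/PID/cgroup contains a line of the form 0::/path. The following sketch checks for that line; in_cgroup is a hypothetical helper, and the optional second argument exists only so that the check can be pointed at a different file.

```shell
# Hypothetical helper: succeed if the cgroup file (default: the one for this
# shell) places the process in the given cgroup path, using the cgroup v2
# "0::/path" line format.
in_cgroup() {
  local path="$1" file="${2:-/proc/$$/cgroup}"
  grep -qxF "0::/${path}" "${file}"
}
```

With the variables defined earlier, the shell can be started using cgexec -g "memory:${CGROUP_PATH}" bash instead of a hard-coded user ID, and in_cgroup "${CGROUP_PATH}" && echo "contained" then confirms the placement from within it.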
The systemctl status command can be used to show the cgroup hierarchy. Since I run bash as the initial process, my hierarchy looks like the following:
/
├─init.scope
│ └─1 /sbin/init
├─system.slice
│ ├─...
│ ...
└─user.slice
  └─user-1331.slice
    ├─obelisk
    │ └─417624 bash
    ├─...
    ...
Within that shell, I can run ob shell and ob run as usual. With everything running, my hierarchy looks like the following:
/
├─init.scope
│ └─1 /sbin/init
├─system.slice
│ ├─...
│ ...
└─user.slice
  └─user-1331.slice
    ├─obelisk
    │ ├─417624 bash
    │ ├─435360 /home/tcard/.nix-profile/bin/ob shell
    │ ├─443578 ./.obelisk/impl/.attr-cache/command.out/bin/ob --no-handoff shell
    │ ├─443606 bash --rcfile /tmp/nix-shell-443606-0/rc
    │ ├─443651 /home/tcard/.nix-profile/bin/ob run
    │ ├─451869 ./.obelisk/impl/.attr-cache/command.out/bin/ob --no-handoff run
    │ ├─452114 bash /run/user/1331/nix-shell-452114-0/rc
    │ ├─452147 /nix/store/XXXXXXXX-ghcid-0.8/bin/ghcid ...
    │ └─452152 /nix/store/XXXXXXXX-ghc-8.6.5/lib/ghc-8.6.5/bin/ghc ...
    ├─...
    ...
The problematic ghc process is run within the new cgroup, so it should now only crash that cgroup instead of my whole session if/when it consumes excess memory.
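If ghc does get killed, the memory.events file of the cgroup records it. The following is a minimal sketch for reading the counter; oom_kill_count is a hypothetical helper, and it assumes the usual "key value" line format of that file.

```shell
# Hypothetical helper: print the oom_kill counter from the memory.events
# file of a cgroup directory.
oom_kill_count() {
  local dir="$1"
  awk '$1 == "oom_kill" { print $2 }' "${dir}/memory.events"
}
```

After a suspicious crash, oom_kill_count "${CGROUP_DIR}" shows how many processes in the cgroup have been killed by the OOM killer, which helps tell a memory kill apart from an ordinary build failure.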