Nix Terminfo and Locale Archive
While trying to use the latest version of Obelisk, I ran into “invalid
character” errors when attempting to use the --verbose option. I traced
the source of the error on the Haskell side to the printing of Unicode
characters to the screen, and replacing those characters with ASCII
characters allowed me to get past the error. The issue has not been
updated yet, but a friend relayed some information about the cause as
well as some workarounds. This blog entry gives my interpretation of the
issue, so any errors are my own.
On any system other than NixOS, software comes from both the host OS and Nix. Nix processes inherit environment variables by default, but these environment variables may reference things that are not configured or installed under Nix. Many issues can be avoided by separating the host OS and Nix, but some things are difficult to separate. In particular, a terminal provided by the host is often used to run Nix commands, and the configuration for that terminal may not match the Nix environment.
I first ran into this kind of issue when the backspace key did not work
in nix-shell. The problem was that Nix did not have terminfo entries for
the terminal that I was using. I initially got around that issue by
setting TERM=xterm every time I ran nix-shell. A friend later taught me
a different workaround that is less annoying. The terminfo database is
located at /usr/lib/terminfo, but the software also checks ~/.terminfo!
Once the necessary files are copied from /usr/lib/terminfo to
~/.terminfo, Nix processes can find them.
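The copy step can be sketched in Haskell, the language used elsewhere in this entry. This is only a sketch under assumptions: it assumes the common terminfo layout in which an entry lives in a subdirectory named after its first letter (for example x/xterm), and the names entryRelPath and copyTerminfoEntry are made up for illustration.

```haskell
import System.Directory (copyFile, createDirectoryIfMissing, getHomeDirectory)
import System.FilePath (takeDirectory, (</>))

-- | Relative path of a terminfo entry inside a database directory.
-- Assumption: entries are filed under a directory named after the first
-- letter of the terminal name. Some systems use a different layout.
entryRelPath :: String -> Maybe FilePath
entryRelPath [] = Nothing
entryRelPath term@(c : _) = Just ([c] </> term)

-- | Copy the entry for the given terminal name from a host database
-- directory into ~/.terminfo, returning the destination path.
-- Returns Nothing when the terminal name is empty.
copyTerminfoEntry :: FilePath -> String -> IO (Maybe FilePath)
copyTerminfoEntry hostDb term =
  case entryRelPath term of
    Nothing -> pure Nothing
    Just rel -> do
      home <- getHomeDirectory
      let dest = home </> ".terminfo" </> rel
      -- Create ~/.terminfo/<letter>/ if it does not exist yet.
      createDirectoryIfMissing True (takeDirectory dest)
      copyFile (hostDb </> rel) dest
      pure (Just dest)
```

Calling copyTerminfoEntry "/usr/lib/terminfo" with the value of TERM from the host terminal would copy that single entry. copyFile throws an exception if the source entry does not exist, which seems like reasonable behavior for a one-off helper.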
Why not install these terminfo files in the Nix environment? The files
that I need are specific to the terminal that I use, while other
developers likely use different terminals. Perhaps all terminfo files
could be installed in the Nix environment, but that increases the size
of the closure. It would also be unsatisfactory to include OS-level
files that are unrelated to the software that a derivation provides.
The “invalid character” errors are caused by a similar issue. My locale
environment variables were set to en_US.UTF-8, but that locale was not
available within the Nix environment. One workaround is to use the
C.UTF-8 locale instead for both the LANG and LC_CTYPE environment
variables. This is similar to how I set the TERM environment variable to
work around terminal issues. I tested this and confirmed that it does
indeed work around the issue: Unicode characters appeared in the output.
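When debugging this kind of problem, it can help to see which encoding a Haskell program actually ended up with. GHC's runtime chooses its default text encoding from the locale, so a small diagnostic along the following lines can show whether the locale variables took effect. This is a sketch of my own rather than anything from Obelisk, and the names describe and report are made up for illustration.

```haskell
import Data.Maybe (fromMaybe)
import GHC.IO.Encoding (getLocaleEncoding)
import System.Environment (lookupEnv)

-- | Locale variables discussed in the text.
localeVars :: [String]
localeVars = ["LANG", "LC_CTYPE"]

-- | Render one variable as NAME=value, marking unset variables.
describe :: String -> Maybe String -> String
describe name value = name ++ "=" ++ fromMaybe "(unset)" value

-- | Collect the locale variables and the encoding that GHC selected.
report :: IO [String]
report = do
  vars <- mapM (\v -> describe v <$> lookupEnv v) localeVars
  enc <- getLocaleEncoding
  pure (vars ++ ["locale encoding: " ++ show enc])
```

Printing the result with report >>= mapM_ putStrLn inside and outside the Nix environment should make the difference visible: a program started with a locale that is not available is likely to report an encoding that cannot represent the Unicode characters in question.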
Another way to work around the issue is to use the LOCALE_ARCHIVE
environment variable. The documentation indicates that it exists to
provide a way to specify a locale archive when using different versions
of glibc, but it can also be used to point to a locale archive on the
host system, usually located at /usr/lib/locale/locale-archive. I tested
this and confirmed that it works as well.
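In a shell this is just a matter of exporting the variable, but the same idea can be sketched in Haskell for a program that launches other commands. The following is an illustration under assumptions, not part of Obelisk: the names withLocaleArchive and runWithLocaleArchive are hypothetical, and the archive path is passed in by the caller because its location depends on the host.

```haskell
import System.Environment (getEnvironment)
import System.Exit (ExitCode)
import System.Process (CreateProcess (..), proc, waitForProcess, withCreateProcess)

-- | Add LOCALE_ARCHIVE to an environment unless it is already set,
-- so that an explicit user setting is never overridden.
withLocaleArchive :: FilePath -> [(String, String)] -> [(String, String)]
withLocaleArchive archive environment =
  case lookup "LOCALE_ARCHIVE" environment of
    Just _ -> environment
    Nothing -> ("LOCALE_ARCHIVE", archive) : environment

-- | Run a command with LOCALE_ARCHIVE pointing at the given archive,
-- inheriting the rest of the current environment.
runWithLocaleArchive :: FilePath -> FilePath -> [String] -> IO ExitCode
runWithLocaleArchive archive cmd args = do
  environment <- getEnvironment
  let process = (proc cmd args) {env = Just (withLocaleArchive archive environment)}
  withCreateProcess process $ \_ _ _ handle -> waitForProcess handle
```

For example, runWithLocaleArchive "/usr/lib/locale/locale-archive" "ob" ["--verbose"] would start the command with the host archive visible. Whether a host archive is compatible with the glibc in the Nix environment is something I have only checked on my own system.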
What can the Obelisk project do to address the issue? It is a general
Nix issue, so perhaps it is best to just document it in the README or
FAQ. Alternatively, the project could include
pkgs.development.libraries.glibc.locales with allLocales set to true so
that all locales are installed. This is analogous to installing all
terminfo files, and the documentation indicates that it takes about
100MB of space.
If locale information is not needed, however, a hack that would be
transparent to users would be to make the ob program set the locale
environment variables to C.UTF-8. This cannot simply be done when the
program starts, because the environment variables are read before the
Haskell code is run. The program can, however, set them before “hand
off”, and when --no-handoff is used it can set them and then re-execute
itself.
This hack is easy to implement, so I tried it out. The commit is on GitHub, but I will not link to it because this blog entry will outlive the repository branch. I shall instead include the code here.
The setLocaleC function checks locale environment variables and sets
them to C.UTF-8 when necessary, returning True if any were set.
setLocaleC :: MonadObelisk m => m Bool
setLocaleC = fmap or . forM envVars $ \envVar -> do
    alreadySet <- liftIO $ (== Just locale) <$> lookupEnv envVar
    unless alreadySet $ do
      putLog Debug $ T.pack $ unwords ["Setting", envVar, "to", locale]
      liftIO $ setEnv envVar locale
    pure $ not alreadySet
  where
    envVars = ["LANG", "LC_CTYPE"]
    locale = "C.UTF-8"
The reExecuteOb function re-executes the ob command.
reExecuteOb :: MonadObelisk m => FilePath -> [String] -> m ()
reExecuteOb obPath myArgs = do
    putLog Debug "Re-executing..."
    void $ liftIO $ rawSystem obPath myArgs
In the main' function, setLocaleC is called and its result is bound to
localeSet. In the “hand off” case, nothing more needs to be done. In the
--no-handoff case, reExecuteOb is called if the locale was set.
if localeSet
then reExecuteOb obPath myArgs
else ob $ _args_command args'
It works as expected.
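The code above depends on Obelisk internals such as MonadObelisk and putLog, so it cannot be tried in isolation. For anyone who wants to experiment with the same pattern, here is a standalone sketch in plain IO. It is my own simplification rather than the code in the commit: the logging is dropped, needsOverride and reExecuteSelf are hypothetical names, and getExecutablePath stands in for the obPath value, which only makes sense for a compiled binary.

```haskell
import Control.Monad (forM, when)
import System.Environment (getArgs, getExecutablePath, lookupEnv, setEnv)
import System.Exit (ExitCode)
import System.Process (rawSystem)

-- | The locale that the variables are forced to.
locale :: String
locale = "C.UTF-8"

-- | True when a variable's current value must be replaced.
needsOverride :: Maybe String -> Bool
needsOverride current = current /= Just locale

-- | Set LANG and LC_CTYPE to C.UTF-8 where needed.
-- Returns True if any variable was changed.
setLocaleC :: IO Bool
setLocaleC = fmap or . forM ["LANG", "LC_CTYPE"] $ \envVar -> do
  override <- needsOverride <$> lookupEnv envVar
  when override $ setEnv envVar locale
  pure override

-- | Run the current executable again with the same arguments.
-- The child inherits the modified environment, so its runtime
-- starts with the new locale settings.
reExecuteSelf :: IO ExitCode
reExecuteSelf = do
  self <- getExecutablePath
  args <- getArgs
  rawSystem self args
```

A main function would call setLocaleC first, call reExecuteSelf only when it returns True, and otherwise continue as usual. The second run finds the variables already set, so setLocaleC returns False and there is no loop. Unlike reExecuteOb, this sketch returns the child's exit code instead of discarding it, so that the caller can pass it on with exitWith.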