
Bash Escaping Issue

I finished updating the new implementation of Base and am now working on the tests. Through testing, I found an issue with the code that copies the configuration of the current shell to a new shell, as discussed in Running Base in a New Shell and Rewrite When More Than N Lines.

The Issue

When in a base environment, the BASE environment variable is set to the base directory. Echoing that variable unquoted mangled the string: each “t” character acted as a separator, so it came out as a space (or vanished at the end of the string). The quoted expansion was unaffected. This only happened in one of the three ways that a base environment can be configured: the one that copies the environment configuration.

docker@basetest:~$ mkdir test
docker@basetest:~$ cd test/
docker@basetest:~/test$ . base
[test] $ declare -p BASE
declare -- BASE="/home/docker/test"
[test] $ echo ${BASE} "  " "${BASE}"
/home/docker/ es    /home/docker/test
[test] $ exit
exit

Since it only happened in the copied environment, I figured it must be due to an environment issue. Investigating, I found that the value of the IFS variable (\t\n) was not unescaped correctly, so the “t” and “n” characters were being interpreted as internal field separators.
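The effect can be reproduced in isolation. The sketch below is not part of Base: split_demo is a hypothetical helper, and it assumes the copied IFS ended up holding the literal characters backslash, t, backslash, n rather than a real tab and newline.

```bash
#!/usr/bin/env bash

BASE="/home/docker/test"

# Hypothetical helper: echo BASE unquoted while IFS is set to the given value.
# "local IFS" keeps the change from leaking into the rest of the shell.
split_demo () {
  local IFS="$1"
  echo ${BASE}   # intentionally unquoted so that word splitting applies
}

# Literal backslash, "t", backslash, "n": every "t" now splits the string.
broken="$(split_demo '\t\n')"

# Real space, tab, and newline: the path contains none of them.
correct="$(split_demo $' \t\n')"

echo "broken:  ${broken}"    # broken:  /home/docker/ es
echo "correct: ${correct}"   # correct: /home/docker/test
```

The broken result matches the unquoted output in the session above, which is what points at IFS rather than at BASE itself.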

The Fix

In order to set an environment variable with a value that includes a newline, the value must be quoted differently. For example, a three-character value of a space, a tab, and a newline can be quoted as $' \t\n'. Since tabs are fine as is, the new implementation only escapes newlines.
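A quick way to see the difference between the two quoting styles is to compare string lengths. The variable names here are made up for illustration.

```bash
#!/usr/bin/env bash

# ANSI-C quoting: the escapes are decoded into a space, a tab, and a newline.
ansi=$' \t\n'

# Plain single quotes: the backslashes stay literal, giving five characters.
plain=' \t\n'

echo "${#ansi}"    # prints 3
echo "${#plain}"   # prints 5

# Only the ANSI-C quoted value actually contains a newline character.
if [[ "${ansi}" == *$'\n'* ]] ; then
  echo "ansi contains a newline"
fi
```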

The new version of the demonstration script, available in full on GitHub at demo.sh, is shown below. Aside from the removal of tab escaping on line 19, the change is in the else branch.

_demo_load_env () {
  local defcmd defn line quoteflag value var
  while IFS=$'\n' read -r line ; do
    if [[ "${line}" =~ ^declare ]] ; then
      if [ -n "${defn}" ] ; then
        echo "${defn}"
        defn=""
      fi
      defcmd="${line#declare -* }"
      var="${defcmd%%=*}"
      case "${var}" in
        BASH_* | FUNCNAME | GROUPS | cmd | val )
          ;;
        DEMO_ENV | defcmd | defn | line | var )
          ;;
        * )
          defn="${line}"
          ;;
      esac
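    else
      # Not a declare line: ${line} continues a multi-line value.
      if [[ "${defn}" =~ ^[^=]*=\$ ]] ; then
        # Already converted to $'...' quoting. If the previous line's final
        # double quote was swapped for a single quote, swap it back: the
        # value continues, so that quote was part of the value.
        if [ "${quoteflag}" -eq 1 ] ; then
          defn="${defn%"'"}\""
        fi
      else
        # First continuation line: replace the opening double quote with $'
        # and escape any existing backslashes.
        value="${defn#*'"'}"
        defn="${defn%%=*}=\$'${value//\\/\\\\}"
      fi
      # Tentatively treat a trailing double quote as the end of the value.
      if [ "${line: -1}" == "\"" ] ; then
        line="${line%?}'"
        quoteflag=1
      else
        quoteflag=0
      fi
      # Join with an escaped newline, escaping backslashes in the new line.
      defn="${defn}\\n${line//\\/\\\\}"
    fi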
  done < <(declare -p)
  test -z "${defn}" || echo "${defn}"
  alias -p
}

A multi-line value is rewritten from double quotes to $'...' quoting, so its closing double quote must become a single quote. The loop cannot tell whether a line ending in a double quote is the end of the value until it reads the next line, so a quoteflag variable tracks whether that replacement was made. When a continuation line is read and ${defn} has already been converted to $'...' quoting, a trailing quote that was replaced in the previous iteration is changed back to a double quote, since it must be a double quote within the value if the value continues. Otherwise, this is the first continuation line: the opening double quote of ${defn} is converted to $' and any existing backslashes are escaped. Next, if the last character of ${line} is a double quote, it is replaced with a single quote, and quoteflag is set accordingly. Finally, ${defn} is redefined as the concatenation of ${defn}, an escaped newline (\n), and the possibly transformed ${line} with its backslashes escaped.
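The core of the transformation, escaping backslashes first and then newlines, can be sketched apart from the line-by-line parsing. This is not the code from demo.sh: to_ansi_c_defn is a hypothetical helper that works on a whole value at once, and it assumes the value contains no single quotes.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print a single-line NAME=$'...' definition for a value.
# Assumption: the value contains no single quotes.
to_ansi_c_defn () {
  local name="$1" value="$2"
  value="${value//\\/\\\\}"     # escape existing backslashes first
  value="${value//$'\n'/\\n}"   # then turn each newline into a literal \n
  printf "%s=\$'%s'\n" "${name}" "${value}"
}

# A value with a backslash, a tab, a newline, and double quotes.
original=$'path\\to\tfile\nsecond "line"'

defn="$(to_ansi_c_defn copy "${original}")"
echo "${defn}"

# Evaluating the definition should recreate the original value exactly.
eval "${defn}"
if [ "${copy}" = "${original}" ] ; then
  echo "round trip ok"
fi
```

Escaping backslashes before newlines matters: in the other order, the backslash of each freshly added \n would itself be doubled, and the newline would come back as two literal characters.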

Note that this change does not resolve the incorrect parsing of values that contain declare syntax after a newline: a continuation line that begins with “declare” is still treated as the start of a new definition.
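The limitation is easy to see with the same ^declare test the loop uses. The sample text below is hand-written to look like multi-line declare -p output; the NOTE and FAKE names are invented for the example, and the exact output format of declare -p may differ between Bash versions.

```bash
#!/usr/bin/env bash

# Hand-written sample: ONE variable whose value has a second line that
# happens to start with "declare".
sample='declare -- NOTE="first line
declare -- FAKE=\"oops\""'

count=0
while IFS=$'\n' read -r line ; do
  # Same test as the loop in _demo_load_env.
  if [[ "${line}" =~ ^declare ]] ; then
    count=$((count + 1))
  fi
done <<< "${sample}"

# Only one variable is defined, but two lines look like definitions.
echo "definitions detected: ${count}"   # definitions detected: 2
```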

A number of manual tests work without issue, but this mess of ugly code needs many automated tests to increase confidence that it works as expected.

Update

This demonstration script does indeed have more escaping issues! The issues and fixes are discussed in later blog entries.

The final version of the script can be found in Abort Transformation!