Abort Transformation!

Over the weekend, I had a chance to think about Bash escaping during a bath. I have been anxious to release Base so that I can move on to more important (as well as much more enjoyable) tasks, but I am not satisfied with the current implementation of the environment copying functionality. The issue with environment variable values that have declare syntax following a newline is a clear indicator that it is not implemented well, and I suspect that the transformation between string types does not cover all possibilities.

After the bath, I decided to try implementing a double-quoted string parser in Bash. I realized that I also need to handle newlines in strings that are stored in arrays, making the code even more complex! I made good progress, but the result was many hundreds of lines of difficult-to-read code! I was considering writing an implementation in Haskell that I could test well, using unit tests as well as QuickCheck, and then translate to Bash…

I then found information about double-quoted strings in the Bash manual, and there are indeed other characters that are treated specially! Here are some cases that I was not handling correctly:

$ declare TEST='`'
$ declare -p TEST
declare -- TEST="\`"
$ declare TEST='$'
$ declare -p TEST
declare -- TEST="\$"
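
One way to gain confidence in how declare -p escapes these characters is to let Bash parse its own output again. The following is only a minimal sketch: the DEMO_* names are hypothetical and not part of the demonstration script, and it assumes a non-interactive Bash so that history expansion does not interfere.

```bash
#!/usr/bin/env bash
# Hypothetical names (DEMO_*) are used only for this illustration.
DEMO_OK=0
for DEMO_CHAR in '`' '$' '\' '"'; do
  DEMO_ORIG="${DEMO_CHAR}"
  DEMO_DECL="$(declare -p DEMO_ORIG)"   # for example: declare -- DEMO_ORIG="\`"
  unset DEMO_ORIG
  eval "${DEMO_DECL}"                   # Bash parses its own output
  if [ "${DEMO_ORIG}" = "${DEMO_CHAR}" ]; then
    DEMO_OK=$((DEMO_OK + 1))
    echo "round trip ok: ${DEMO_DECL}"
  fi
done
```

Each value should survive the round trip, which suggests that the escaping is best left to Bash rather than reimplemented by hand.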

The manual also mentions possible escaping of !, depending on history expansion settings! Transformation of strings is much more difficult than I had expected. Thinking about possible ways to get around the issue, I realized that there is a much better way that does not involve transformation of escape sequences at all!

In Bash, quoted strings that are written together are concatenated. In the following example, the string is written using three quoted parts: two single-quoted parts, with a double-quoted part in the middle.

$ echo 'Alpha is Greek for "doesn'"'"'t work."'
Alpha is Greek for "doesn't work."

The output of declare -p formats all values using double quotes. When a value contains a newline, one can simply close the double quote, insert the newline as $'\n' (ANSI-C quoting), and reopen the double quote, without having to transform the rest of the string! It seems obvious in hindsight.
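
As a small sketch of the idea, the following compares a value that contains a newline with the same value written on a single line by concatenating quoted parts. The variable names DEMO_VALUE and DEMO_JOINED are hypothetical, and the sketch assumes a Bash version whose declare -p prints the newline literally inside the double quotes.

```bash
#!/usr/bin/env bash
# Hypothetical variable names, used only for this illustration.
DEMO_VALUE=$'first line\nsecond line'

# declare -p prints the value in double quotes, spanning two lines
# (assuming Bash prints the newline literally for scalar values).
declare -p DEMO_VALUE

# The same value on one line: close the double quote, add $'\n',
# then reopen the double quote. The three parts are concatenated.
declare -- DEMO_JOINED="first line"$'\n'"second line"

[ "${DEMO_VALUE}" = "${DEMO_JOINED}" ] && echo "values match"
```

The parts before and after the newline are copied verbatim, so whatever escaping declare -p applied to them stays valid.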

New Implementation

In order to avoid hundreds of lines of code that parses strings, I decided to go back to running declare -p per environment variable. Previously, a major issue in doing this was the use of pipes, grep, and sed, which spawned many processes. That issue can be avoided by parsing the variable names in Bash; declare itself is a Bash builtin.

Function _demo_select_env parses environment variable names from the output of declare -p. Some environment variables that should not be copied are filtered out.

_demo_select_env () {
  local defn line var
  while IFS=$'\n' read -r line ; do
    if [[ "${line}" =~ ^declare\ - ]] ; then
      defn="${line#declare -* }"
      var="${defn%%=*}"
      case "${var}" in
        BASH_* | FUNCNAME | GROUPS | cmd | val  | \
        DEMO_ENV | decl | defn | line | var )
          ;;
        * )
          echo "${var}"
          ;;
      esac
    fi
  done < <(declare -p)
}
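
The two parameter expansions in _demo_select_env do the work that grep and sed did previously. As a rough illustration, here they are applied to a few made-up declare lines (the sample lines and the DEMO_NAMES variable are assumptions for this sketch, not output from a real session):

```bash
#!/usr/bin/env bash
# Sample lines are made up for illustration; they mimic declare -p output.
DEMO_NAMES=""
for line in 'declare -x HOME="/home/demo"' \
            'declare -- TEST="\$"' \
            'declare -ar DEMO_LIST=([0]="a=b")'; do
  defn="${line#declare -* }"   # drop the shortest prefix "declare -<flags> "
  var="${defn%%=*}"            # drop everything from the first "=" onward
  DEMO_NAMES="${DEMO_NAMES}${var} "
  echo "${var}"
done
```

Because %% removes the longest matching suffix, an = inside the value does not affect the extracted name.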

Note that this function also outputs any name that appears in declare syntax after a newline inside the value of an environment variable, because such a line looks just like the start of a real declaration. This causes two issues:

  • Some variable names that are output may not actually exist. This issue is resolved by simply ignoring the variables that do not exist.
  • Some variable names may be output more than once. This issue is resolved by filtering the list through sort -u.
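
Both issues can be reproduced with a single tricky value. The sketch below uses a hypothetical variable DEMO_TRICKY and a hypothetical helper demo_names that applies the same parsing to just that variable; it assumes that DEMO_FAKE is not set and that declare -p prints newlines literally.

```bash
#!/usr/bin/env bash
# Hypothetical variable whose value contains lines in declare syntax.
DEMO_TRICKY=$'x\ndeclare -- DEMO_FAKE=1\ndeclare -- DEMO_TRICKY=2'

# Hypothetical helper: same parsing as above, limited to DEMO_TRICKY.
demo_names () {
  local defn line
  while IFS= read -r line; do
    if [[ "${line}" =~ ^declare\ - ]]; then
      defn="${line#declare -* }"
      echo "${defn%%=*}"
    fi
  done < <(declare -p DEMO_TRICKY)
}

# Raw candidates may include a duplicate and a name that does not exist.
demo_names

# sort -u drops duplicates; declare -p fails for names that do not exist.
DEMO_KEPT=""
while IFS= read -r name; do
  if declare -p "${name}" >/dev/null 2>&1; then
    DEMO_KEPT="${DEMO_KEPT}${name} "
  fi
done < <(demo_names | sort -u)
echo "kept: ${DEMO_KEPT}"
```

Only the real variable should remain after both filters.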

Function _demo_load_env becomes quite simple! Each variable name is queried, and multiple lines are joined as described above. Only declare commands for valid environment variables are output. Finally, alias commands are also output.

_demo_load_env () {
  local decl line var
  while IFS=$'\n' read -r var ; do
    decl=""
    while IFS=$'\n' read -r line ; do
      if [ -z "${decl}" ] ; then
        decl="${line}"
      else
        # Join lines: close the double quote, add $'\n', and reopen the quote.
        decl="${decl}\"\$'\\n'\"${line}"
      fi
    done < <(declare -p "${var}" 2>/dev/null)
    [ -z "${decl}" ] || echo "${decl}"
  done < <(_demo_select_env | sort -u)
  alias -p
}
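
To check that the joined output really restores the original value, the joining step can be tried on a single variable. This is only a sketch: demo_join_decl is a hypothetical helper that repeats the inner loop of _demo_load_env, and DEMO_MULTI, DEMO_DECL, and DEMO_COPY are made-up names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: the inner loop of _demo_load_env for one variable.
demo_join_decl () {
  local decl="" line
  while IFS= read -r line; do
    if [ -z "${decl}" ]; then
      decl="${line}"
    else
      # Close the double quote, add $'\n', and reopen the double quote.
      decl="${decl}\"\$'\\n'\"${line}"
    fi
  done < <(declare -p "$1" 2>/dev/null)
  [ -z "${decl}" ] || printf '%s\n' "${decl}"
}

DEMO_MULTI=$'one\ntwo\nthree'
DEMO_DECL="$(demo_join_decl DEMO_MULTI)"
echo "${DEMO_DECL}"   # a single line, with each newline written as "$'\n'"

# Evaluate the single-line declaration in a subshell and read the value back.
DEMO_COPY="$(unset DEMO_MULTI; eval "${DEMO_DECL}"; printf '%s' "${DEMO_MULTI}")"
[ "${DEMO_COPY}" = "${DEMO_MULTI}" ] && echo "round trip ok"
```

Since every declaration ends up on exactly one line, the output can be processed line by line without any knowledge of Bash quoting.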

The updated demonstration script is available on GitHub: demo.sh