Render and Parse

Series: TTC (Textual Type Classes)

Static type checking is a characteristic feature of the Haskell programming language, and types play an important role in the design of Haskell software. Distinct types, created using data or newtype keywords, have a number of benefits over general-purpose types. They can help developers avoid passing the wrong value to a function, they provide more “documentation” than general-purpose types, and they can greatly aid in refactoring. They can also define some boundaries within an API, making large software easier to understand.

Many data types have a natural textual representation, and one often needs to implement a function that “renders” a value to text. For example, the text may be displayed to the user, used in an API call, inserted into a database, or included in a log message. One also often needs to implement a function that “parses” text to a value. For example, the text may be user input, the response of an API call, or the result of a database lookup.

Haskell has many textual data types, described in the first article of this series: Textual Type Class. In most cases, each distinct data type renders/parses to a specific textual data type, and that type can be converted to any other textual data type that may be required. For example, an API call might require a Text value, a database insert might require a ByteString value, and a log message might require a String value.

The TTC (Textual Type Classes) library provides a way to manage the rendering and parsing of data types to multiple textual data types without unnecessary conversion between textual data types and without having to remember which textual data type is best used with which data type. This article focuses on the Render and Parse type classes.

As a minimal example, this article makes use of the following type. Name is defined using newtype with an underlying type of Text.

module Example.Name where

import Data.Text (Text)

newtype Name = Name Text

`Show` and `Read`

A good way to introduce the motivation for the Render and Parse type classes is to consider the problems with using the Show and Read type classes that are defined in the base package. Both the Show and Render type classes are used to render values as text, and both the Read and Parse type classes are used to parse values from text.

Intended Use

The Show and Read type classes are intended to work with text that contains Haskell source code. A Show instance should render a string of valid Haskell code that can recreate the type when parsed using a Read instance. This can be useful when debugging, but Haskell code is rarely the desired representation for passing a value to an API or storing it in a database.

There are three basic ways to create Show and Read instances for the example Name type. One way is to derive the instances as follows.

module Example.Name where

import Data.Text (Text)

newtype Name = Name Text
  deriving (Read, Show)

A value of this type is created using the Name constructor, and the derived instances reflect this.

λ: show (Name "Bill")
"Name \"Bill\""
it :: String
λ: print (Name "Bill")
Name "Bill"
it :: ()
λ: read "Name \"Bill\"" :: Name
Name "Bill"
it :: Name

Another way to create Show and Read instances is to use some extensions to derive the instances based on the wrapped type.

{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

module Example.Name where

import Data.Text (Text)

newtype Name = Name Text
  deriving newtype (Read, Show)

The derived instances no longer use strings prefixed with the constructor. Since there is no IsString instance, note that this definition deviates from the intended usage of Show and Read because the rendered strings are not valid Haskell code for creating a value.

λ: show (Name "Bill")
"\"Bill\""
it :: String
λ: print (Name "Bill")
"Bill"
it :: ()
λ: read "\"Bill\"" :: Name
"Bill"
it :: Name

Note that the rendered strings are quoted. This involves escaping in cases where the rendered value includes quotation marks.

It is quite common for developers to use Show and Read instances for their own purposes. For example, a developer may want to render a Name as the name itself, without quotes. This can be done by writing the instances by hand.

module Example.Name where

import qualified Data.Text as T
import Data.Text (Text)

newtype Name = Name Text

instance Read Name where
  readsPrec _ s = [(Name (T.pack s), "")]

instance Show Name where
  show (Name name) = T.unpack name

Such instances deviate even more from the intended usage of Show and Read, as the rendered strings are not valid Haskell code.

λ: show (Name "Bill")
"Bill"
it :: String
λ: print (Name "Bill")
Bill
it :: ()
λ: read "Bill" :: Name
Bill
it :: Name

The use of Show instances is a common source of bugs. It is easy to derive an instance, and one often does so when prototyping new software. Use of the derived instance with an API that does not expect Haskell code can then cause issues.

As an example of such a bug in production software, a derived Show instance was used to render a string for a value in a CSV file. The rendered string included quotation marks in the middle of the string (like Name "Bill"), which caused the CSV file to be invalid. The invalid CSV file resulted in other failures, and the bugs persisted for over two years before the use of the Show instance was identified and corrected.

`String` Usage

Another problem with using Show and Read instances is that doing so utilizes the String type. Since String is a type alias for a list of characters, it has poor performance.

In the example, the Name type has an underlying value of type Text. To render a Text value to pass to an API, one could use code like the following:

callApi $ T.pack (show name)

Sometimes people will even create a helper function for this because it is used so often.

showT :: Show a => a -> Text
showT = T.pack . show

The String type is used only because of the use of Show.

`Read` Errors

The use of Read to parse introduces another issue: it does not provide support for detailed error messages. The error message provided by the readEither function (defined in Text.Read) is almost completely unhelpful:

λ: import Text.Read (readEither)
λ: readEither "Test" :: Either String Name
Left "Prelude.read: no parse"

It can be frustrating when you run into such an error message while starting to work with a large project that you are unfamiliar with, as it gives no clues about what is not parsing. When using simple types, one generally provides a better error message when parsing:

λ: import Text.Read (readMaybe)
λ: maybe (Left "invalid Name") Right (readMaybe "Test" :: Maybe Name)
Left "invalid Name"

When parsing complex types, however, it is very helpful to use error messages that provide details about why parsing fails. Parsing a Name could fail due to the use of an invalid character (such as null) or due to length limitations. The Read type class does not allow for this.

Type-Specific Functions

Instead of using Show and Read, one can write functions to render and parse each type. In the following example, mkName is a smart constructor that is used to ensure that a Name does not contain invalid characters, and toText renders a Name as Text by simply unwrapping the underlying value.

module Example.Name
  ( Name
  , mkName
  , toText
  ) where

import Data.Char (isPrint)
import qualified Data.Text as T
import Data.Text (Text)

newtype Name = Name Text

mkName :: Text -> Either String Name
mkName t
    | T.all isPrint t = Right $ Name t
    | otherwise       = Left "invalid Name: contains invalid characters"

toText :: Name -> Text
toText (Name name) = name

Since the underlying type is Text, the smart constructor parses Text and the toText function renders Text. In cases where other textual types are required, explicit conversion to/from Text is necessary. This can be done in several ways, and use of the Textual Type Class provides a way with little boilerplate. The conversion can be done at the location that an API function is called, or it could be included in the API, as follows:

module Example.Name
  ( Name
  , mkName
  , toText
  ) where

import Control.Monad (unless)
import Data.Char (isPrint)
import qualified Data.Text as T
import Data.Text (Text)
import qualified Data.TTC as TTC

newtype Name = Name Text

mkName :: TTC.Textual t => t -> Either String Name
mkName = TTC.asT $ \t -> do
    unless (T.all isPrint t) $
      Left "invalid Name: contains invalid characters"
    pure $ Name t

toText :: TTC.Textual t => Name -> t
toText (Name name) = TTC.fromT name

The Render and Parse type classes use this approach. The use of type classes provides an interface that is consistent as well as easy to use.

`Render`

The Render type class renders a data type as a textual data type.

class Render a where
  render :: Textual t => a -> t

Instances may use the textual data type that is most natural. The return value of the render function uses the Textual type class to automatically convert to any Textual type. In cases where the type is ambiguous, one of the following functions may be used:

renderS   :: Render a => a -> String
renderT   :: Render a => a -> T.Text
renderTL  :: Render a => a -> TL.Text
renderTLB :: Render a => a -> TLB.Builder
renderBS  :: Render a => a -> BS.ByteString
renderBSL :: Render a => a -> BSL.ByteString
renderBSB :: Render a => a -> BSB.Builder
renderSBS :: Render a => a -> SBS.ShortByteString

A Render instance for the example Name type may be written as follows:

module Example.Name
  ( Name
  ) where

import Data.Text (Text)
import qualified Data.TTC as TTC

newtype Name = Name Text

instance TTC.Render Name where
  render (Name name) = TTC.fromT name

The TTC.render function can then be used to render a Name value to a String to print to the screen as well as a Text value to pass to some API call:

putStrLn $ "calling API for " ++ TTC.render name
callSomeApi $ TTC.render name

Since the underlying value of Name is Text, that second call to TTC.convert is evaluated to id and is optimized away. The developer can use TTC.convert without having to worry about such details.

While Show instances should produce valid Haskell code, Render instances are intended to be used in any way that the developer needs. The name of the type class was selected to indicate that it is not necessarily isomorphic: the rendered text does not need to contain all of the information required to recreate the original value.

The Data.TTC module does not include any Render instances, allowing developers to define instances as required for each use case. For example, one may want to define a Render instance for the Int type that formats the integer in a specific way.

Default instances for some basic data types are defined for the RenderDefault type class, and you can load the Render instance as follows:

instance TTC.Render Int

Since a type may have at most one instance of a given type class, special care must be taken when defining type class instances in a shared library. In particular, orphan instances should generally not be used in shared libraries since they prevent users of the libraries from writing their own instances.

`Render` Utility Functions

Some libraries use Show instances as the only way to render the provided types. The following function is available for working with such types:

renderWithShow :: (Show a, Textual t) => a -> t

`Parse`

The Parse type class parses a data type from a textual data type.

class Parse a where
  parse :: (Textual t, Textual e) => t -> Either e a

Instances may use the textual data type that is most natural, using one of the Textual “as” functions for conversion. In cases when the type is ambiguous, one of the following functions may be used:

parseS   :: (Parse a, Textual e) => String              -> Either e a
parseT   :: (Parse a, Textual e) => T.Text              -> Either e a
parseTL  :: (Parse a, Textual e) => TL.Text             -> Either e a
parseTLB :: (Parse a, Textual e) => TLB.Builder         -> Either e a
parseBS  :: (Parse a, Textual e) => BS.ByteString       -> Either e a
parseBSL :: (Parse a, Textual e) => BSL.ByteString      -> Either e a
parseBSB :: (Parse a, Textual e) => BSB.Builder         -> Either e a
parseSBS :: (Parse a, Textual e) => SBS.ShortByteString -> Either e a

A Parse instance for the example Name type may be written as follows:

module Example.Name
  ( Name
  ) where

import Control.Monad (unless)
import Data.Char (isPrint)
import qualified Data.Text as T
import Data.Text (Text)
import qualified Data.TTC as TTC

newtype Name = Name Text

instance TTC.Parse Name where
  parse = TTC.asT $ \t -> do
    unless (T.all isPrint t) . Left $
      TTC.fromS "invalid Name: contains invalid characters"
    pure $ Name t

Note that the only difference with the implementation of the smart constructor in the Type-Specific Functions example above is that the error message is also Textual.

While Read instances should parse valid Haskell code, Parse instances are intended to be used in any way that the developer needs. The Data.TTC module does not include any Parse instances, allowing developers to define instances as required for each use case.

Default instances for some basic data types are defined for the ParseDefault type class, and you can load the Parse instance as follows:

instance TTC.Parse Int

`Parse` Auxiliary Functions

TTC provides some auxiliary functions for common use cases. The parseMaybe functions are used to parse text when the error message is not needed:

parseMaybe :: Parse a => Textual t => t -> Maybe a

parseMaybeS   :: Parse a => String              -> Maybe a
parseMaybeT   :: Parse a => T.Text              -> Maybe a
parseMaybeTL  :: Parse a => TL.Text             -> Maybe a
parseMaybeTLB :: Parse a => TLB.Builder         -> Maybe a
parseMaybeBS  :: Parse a => BS.ByteString       -> Maybe a
parseMaybeBSL :: Parse a => BSL.ByteString      -> Maybe a
parseMaybeBSB :: Parse a => BSB.Builder         -> Maybe a
parseMaybeSBS :: Parse a => SBS.ShortByteString -> Maybe a

The parseUnsafe functions are used to parse text that is already known to be valid:

parseUnsafe :: (Parse a, Textual t) => t -> a

parseUnsafeS   :: Parse a => String              -> a
parseUnsafeT   :: Parse a => T.Text              -> a
parseUnsafeTL  :: Parse a => TL.Text             -> a
parseUnsafeTLB :: Parse a => TLB.Builder         -> a
parseUnsafeBS  :: Parse a => BS.ByteString       -> a
parseUnsafeBSL :: Parse a => BSL.ByteString      -> a
parseUnsafeBSB :: Parse a => BSB.Builder         -> a
parseUnsafeSBS :: Parse a => SBS.ShortByteString -> a

`Parse` Utility Functions

The parseEnum function is used to parse a value of an enumerated type, using a Render instance to determine the textual representations of each value:

parseEnum
  :: (Bounded a, Enum a, Render a, Textual t)
  => Bool        -- ^ case-insensitive when 'True'
  -> Bool        -- ^ accept unique prefixes when 'True'
  -> e           -- ^ invalid input error
  -> e           -- ^ ambiguous input error
  -> t           -- ^ textual input to parse
  -> Either e a  -- ^ error or parsed value

The parseEnum' function calls parseEnum with some simple error messages:

parseEnum'
  :: (Bounded a, Enum a, Render a, Textual t, Textual e)
  => String      -- ^ name to include in error messages
  -> Bool        -- ^ case-insensitive when 'True'
  -> Bool        -- ^ accept unique prefixes when 'True'
  -> t           -- ^ textual input to parse
  -> Either e a  -- ^ error or parsed value

For example, parseEnum can be used to parse log levels:

import qualified Data.TTC as TTC

data LogLevel
  = Debug
  | Info
  | Warn
  | Error
  deriving (Bounded, Enum, Eq, Ord, Show)

instance TTC.Parse LogLevel where
  parse = TTC.parseEnum' "LogLevel" True False

instance TTC.Render LogLevel where
  render = TTC.renderWithShow

Some libraries use Read instances as the only way to parse the provided types. The following function is available for working with such types:

parseWithRead
  :: (Read a, Textual t)
  => e           -- ^ invalid input error
  -> t           -- ^ textual input to parse
  -> Either e a  -- ^ error or parsed value

As with parseEnum, the error message is taken as a parameter, and the parseWithRead' function calls parseWithRead with a simple error message:

parseWithRead'
  :: (Read a, Textual t, Textual e)
  => String      -- ^ name to include in error messages
  -> t           -- ^ textual input to parse
  -> Either e a  -- ^ error or parsed value

The maybeParseWithRead function uses a Read instance to parse to a Maybe type:

maybeParseWithRead
    :: (Read a, Textual t)
    => t        -- ^ textual input to parse
    -> Maybe a  -- ^ parsed value or 'Nothing' if invalid

The readsEnum function implements a ReadS instance using parseEnum, consuming all of the input:

readsEnum
  :: (Bounded a, Enum a, Render a)
  => Bool  -- ^ case-insensitive when 'True'
  -> Bool  -- ^ accept unique prefixes when 'True'
  -> ReadS a

The readsWithParse function implements a ReadS instance using a Parse instance, consuming all of the input:

readsWithParse :: Parse a => ReadS a

Changelog

June 10, 2021

The article is updated to accompany the release of TTC 1.1.0.0 with the following changes:

The auxiliary functions for builder and short types are added.
Default instances are added.
The Data.TTC.Instances module is removed.

Author

Travis Cardwell

Published

June 2, 2021

Revised

June 10, 2021

Series

TTC (Textual Type Classes)

Show and Read

Intended Use

String Usage

Read Errors

Type-Specific Functions

Render

Render Utility Functions

Parse

Parse Auxiliary Functions

Parse Utility Functions

Changelog

June 10, 2021

`Show` and `Read`

`String` Usage

`Read` Errors

`Render`

`Render` Utility Functions

`Parse`

`Parse` Auxiliary Functions

`Parse` Utility Functions