Render and Parse
Static type checking is a characteristic feature of the Haskell
programming language, and types play an important role in the design of
Haskell software. Distinct types, created using data or
newtype keywords, have a number of benefits over
general-purpose types. They can help developers avoid passing the wrong
value to a function, they provide more “documentation” than
general-purpose types, and they can greatly aid in refactoring. They can
also define some boundaries within an API, making large software easier
to understand.
Many data types have a natural textual representation, and one often needs to implement a function that “renders” a value to text. For example, the text may be displayed to the user, used in an API call, inserted into a database, or included in a log message. One also often needs to implement a function that “parses” text to a value. For example, the text may be user input, the response of an API call, or the result of a database lookup.
Haskell has many textual data types, described in the first article
of this series: Textual
Type Class. In most cases, each distinct data type renders/parses to
a specific textual data type, and that type can be converted to any
other textual data type that may be required. For example, an API call
might require a Text value, a database insert might require
a ByteString value, and a log message might require a
String value.
The TTC
(Textual Type Classes) library provides a way to manage the
rendering and parsing of data types to multiple textual data types
without unnecessary conversion between textual data types and without
having to remember which textual data type is best used with which data
type. This article focuses on the Render and
Parse type classes.
As a minimal example, this article makes use of the following type.
Name is defined using newtype with an
underlying type of Text.
module Example.Name where
import Data.Text (Text)
newtype Name = Name TextShow and Read
A good way to introduce the motivation for the Render
and Parse type classes is to consider the problems with
using the Show and Read type classes that are
defined in the base
package. Both the Show and Render type classes
are used to render values as text, and both the Read and
Parse type classes are used to parse values from text.
Intended Use
The Show and Read type classes are intended
to work with text that contains Haskell source code. A Show
instance should render a string of valid Haskell code that can recreate
the type when parsed using a Read instance. This can be
useful when debugging, but Haskell code is rarely the desired
representation for passing a value to an API or storing it in a
database.
There are three basic ways to create Show and
Read instances for the example Name type. One
way is to derive the instances as follows.
module Example.Name where
import Data.Text (Text)
newtype Name = Name Text
deriving (Read, Show)A value of this type is created using the Name
constructor, and the derived instances reflect this.
λ: show (Name "Bill")
"Name \"Bill\""
it :: String
λ: print (Name "Bill")
Name "Bill"
it :: ()
λ: read "Name \"Bill\"" :: Name
Name "Bill"
it :: Name
Another way to create Show and Read
instances is to use some extensions to derive the instances based on the
wrapped type.
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Example.Name where
import Data.Text (Text)
newtype Name = Name Text
deriving newtype (Read, Show)The derived instances no longer use strings prefixed with the
constructor. Since there is no IsString instance, note that
this definition deviates from the intended usage of Show
and Read because the rendered strings are not valid Haskell
code for creating a value.
λ: show (Name "Bill")
"\"Bill\""
it :: String
λ: print (Name "Bill")
"Bill"
it :: ()
λ: read "\"Bill\"" :: Name
"Bill"
it :: Name
Note that the rendered strings are quoted. This involves escaping in cases where the rendered value includes quotation marks.
It is quite common for developers to use Show and
Read instances for their own purposes. For example, a
developer may want to render a Name as the name itself,
without quotes. This can be done by writing the instances by hand.
module Example.Name where
import qualified Data.Text as T
import Data.Text (Text)
newtype Name = Name Text
instance Read Name where
readsPrec _ s = [(Name (T.pack s), "")]
instance Show Name where
show (Name name) = T.unpack nameSuch instances deviate even more from the intended usage of
Show and Read, as the rendered strings are not
valid Haskell code.
λ: show (Name "Bill")
"Bill"
it :: String
λ: print (Name "Bill")
Bill
it :: ()
λ: read "Bill" :: Name
Bill
it :: Name
The use of Show instances is a common source of bugs. It
is easy to derive an instance, and one often does so when prototyping
new software. Use of the derived instance with an API that does not
expect Haskell code can then cause issues.
As an example of such a bug in production software, a derived
Show instance was used to render a string for a value in a
CSV file. The rendered string included quotation marks in the middle of
the string (like Name "Bill"), which caused the CSV file to
be invalid. The invalid CSV file resulted in other failures, and the
bugs persisted for over two years before the use of the
Show instance was identified and corrected.
String Usage
Another problem with using Show and Read
instances is that doing so utilizes the String type. Since
String is a type alias for a list of characters, it has
poor performance.
In the example, the Name type has an underlying value of
type Text. To render a Text value to pass to
an API, one could use code like the following:
callApi $ T.pack (show name)Sometimes people will even create a helper function for this because it is used so often.
showT :: Show a => a -> Text
showT = T.pack . showThe String type is used only because of the use of
Show.
Read Errors
The use of Read to parse introduces another issue: it
does not provide support for detailed error messages. The error message
provided by the readEither function (defined in
Text.Read) is almost completely unhelpful:
λ: import Text.Read (readEither)
λ: readEither "Test" :: Either String Name
Left "Prelude.read: no parse"
It can be frustrating when you run into such an error message while starting to work with a large project that you are unfamiliar with, as it gives no clues about what is not parsing. When using simple types, one generally provides a better error message when parsing:
λ: import Text.Read (readMaybe)
λ: maybe (Left "invalid Name") Right (readMaybe "Test" :: Maybe Name)
Left "invalid Name"
When parsing complex types, however, it is very helpful to use error
messages that provide details about why parsing fails. Parsing a
Name could fail due to the use of an invalid character
(such as null) or due to length limitations. The Read type
class does not allow for this.
Type-Specific Functions
Instead of using Show and Read, one can
write functions to render and parse each type. In the following example,
mkName is a smart constructor that is used to ensure that a
Name does not contain invalid characters, and
toText renders a Name as Text by
simply unwrapping the underlying value.
module Example.Name
( Name
, mkName
, toText
) where
import Data.Char (isPrint)
import qualified Data.Text as T
import Data.Text (Text)
newtype Name = Name Text
mkName :: Text -> Either String Name
mkName t
| T.all isPrint t = Right $ Name t
| otherwise = Left "invalid Name: contains invalid characters"
toText :: Name -> Text
toText (Name name) = nameSince the underlying type is Text, the smart constructor
parses Text and the toText function renders
Text. In cases where other textual types are required,
explicit conversion to/from Text is necessary. This can be
done in several ways, and use of the Textual
Type Class provides a way with little boilerplate. The conversion
can be done at the location that an API function is called, or it could
be included in the API, as follows:
module Example.Name
( Name
, mkName
, toText
) where
import Control.Monad (unless)
import Data.Char (isPrint)
import qualified Data.Text as T
import Data.Text (Text)
import qualified Data.TTC as TTC
newtype Name = Name Text
mkName :: TTC.Textual t => t -> Either String Name
mkName = TTC.asT $ \t -> do
unless (T.all isPrint t) $
Left "invalid Name: contains invalid characters"
pure $ Name t
toText :: TTC.Textual t => Name -> t
toText (Name name) = TTC.fromT nameThe Render and Parse type classes use this approach. The
use of type classes provides an interface that is consistent as well as
easy to use.
Render
The Render type class renders a data type as a textual
data type.
class Render a where
render :: Textual t => a -> tInstances may use the textual data type that is most natural. The
return value of the render function uses the
Textual type class to automatically convert to any
Textual type. In cases where the type is ambiguous, one of
the following functions may be used:
renderS :: Render a => a -> String
renderT :: Render a => a -> T.Text
renderTL :: Render a => a -> TL.Text
renderTLB :: Render a => a -> TLB.Builder
renderBS :: Render a => a -> BS.ByteString
renderBSL :: Render a => a -> BSL.ByteString
renderBSB :: Render a => a -> BSB.Builder
renderSBS :: Render a => a -> SBS.ShortByteStringA Render instance for the example Name type
may be written as follows:
module Example.Name
( Name
) where
import Data.Text (Text)
import qualified Data.TTC as TTC
newtype Name = Name Text
instance TTC.Render Name where
render (Name name) = TTC.fromT nameThe TTC.render function can then be used to render a
Name value to a String to print to the screen
as well as a Text value to pass to some API call:
putStrLn $ "calling API for " ++ TTC.render name
callSomeApi $ TTC.render nameSince the underlying value of Name is Text,
that second call to TTC.convert is evaluated to
id and is optimized away. The developer can use
TTC.convert without having to worry about such details.
While Show instances should produce valid
Haskell code, Render instances are intended to be used in
any way that the developer needs. The name of the type class was
selected to indicate that it is not necessarily isomorphic: the rendered
text does not need to contain all of the information required to
recreate the original value.
The Data.TTC module does not include any
Render instances, allowing developers to define instances
as required for each use case. For example, one may want to define a
Render instance for the Int type that formats
the integer in a specific way.
Default instances for some basic data types are defined for the
RenderDefault type class, and you can load the
Render instance as follows:
instance TTC.Render IntSince a type may have at most one instance of a given type class, special care must be taken when defining type class instances in a shared library. In particular, orphan instances should generally not be used in shared libraries since they prevent users of the libraries from writing their own instances.
Render Utility
Functions
Some libraries use Show instances as the only way to
render the provided types. The following function is available for
working with such types:
renderWithShow :: (Show a, Textual t) => a -> tParse
The Parse type class parses a data type from a textual
data type.
class Parse a where
parse :: (Textual t, Textual e) => t -> Either e aInstances may use the textual data type that is most natural, using
one of the Textual “as” functions for conversion. In cases
when the type is ambiguous, one of the following functions may be
used:
parseS :: (Parse a, Textual e) => String -> Either e a
parseT :: (Parse a, Textual e) => T.Text -> Either e a
parseTL :: (Parse a, Textual e) => TL.Text -> Either e a
parseTLB :: (Parse a, Textual e) => TLB.Builder -> Either e a
parseBS :: (Parse a, Textual e) => BS.ByteString -> Either e a
parseBSL :: (Parse a, Textual e) => BSL.ByteString -> Either e a
parseBSB :: (Parse a, Textual e) => BSB.Builder -> Either e a
parseSBS :: (Parse a, Textual e) => SBS.ShortByteString -> Either e aA Parse instance for the example Name type
may be written as follows:
module Example.Name
( Name
) where
import Control.Monad (unless)
import Data.Char (isPrint)
import qualified Data.Text as T
import Data.Text (Text)
import qualified Data.TTC as TTC
newtype Name = Name Text
instance TTC.Parse Name where
parse = TTC.asT $ \t -> do
unless (T.all isPrint t) . Left $
TTC.fromS "invalid Name: contains invalid characters"
pure $ Name tNote that the only difference with the implementation of the smart
constructor in the Type-Specific
Functions example above is that the error message is also
Textual.
While Read instances should parse valid Haskell
code, Parse instances are intended to be used in any way
that the developer needs. The Data.TTC module does not
include any Parse instances, allowing developers to define
instances as required for each use case.
Default instances for some basic data types are defined for the
ParseDefault type class, and you can load the
Parse instance as follows:
instance TTC.Parse IntSince a type may have at most one instance of a given type class, special care must be taken when defining type class instances in a shared library. In particular, orphan instances should generally not be used in shared libraries since they prevent users of the libraries from writing their own instances.
Parse Auxiliary
Functions
TTC provides some auxiliary functions for common use cases. The
parseMaybe functions are used to parse text when the error
message is not needed:
parseMaybe :: Parse a => Textual t => t -> Maybe a
parseMaybeS :: Parse a => String -> Maybe a
parseMaybeT :: Parse a => T.Text -> Maybe a
parseMaybeTL :: Parse a => TL.Text -> Maybe a
parseMaybeTLB :: Parse a => TLB.Builder -> Maybe a
parseMaybeBS :: Parse a => BS.ByteString -> Maybe a
parseMaybeBSL :: Parse a => BSL.ByteString -> Maybe a
parseMaybeBSB :: Parse a => BSB.Builder -> Maybe a
parseMaybeSBS :: Parse a => SBS.ShortByteString -> Maybe aThe parseUnsafe functions are used to parse text that is
already known to be valid:
parseUnsafe :: (Parse a, Textual t) => t -> a
parseUnsafeS :: Parse a => String -> a
parseUnsafeT :: Parse a => T.Text -> a
parseUnsafeTL :: Parse a => TL.Text -> a
parseUnsafeTLB :: Parse a => TLB.Builder -> a
parseUnsafeBS :: Parse a => BS.ByteString -> a
parseUnsafeBSL :: Parse a => BSL.ByteString -> a
parseUnsafeBSB :: Parse a => BSB.Builder -> a
parseUnsafeSBS :: Parse a => SBS.ShortByteString -> aParse Utility
Functions
The parseEnum function is used to parse a value of an
enumerated type, using a Render instance to determine the
textual representations of each value:
parseEnum
:: (Bounded a, Enum a, Render a, Textual t)
=> Bool -- ^ case-insensitive when 'True'
-> Bool -- ^ accept unique prefixes when 'True'
-> e -- ^ invalid input error
-> e -- ^ ambiguous input error
-> t -- ^ textual input to parse
-> Either e a -- ^ error or parsed valueThe parseEnum' function calls parseEnum
with some simple error messages:
parseEnum'
:: (Bounded a, Enum a, Render a, Textual t, Textual e)
=> String -- ^ name to include in error messages
-> Bool -- ^ case-insensitive when 'True'
-> Bool -- ^ accept unique prefixes when 'True'
-> t -- ^ textual input to parse
-> Either e a -- ^ error or parsed valueFor example, parseEnum can be used to parse log
levels:
import qualified Data.TTC as TTC
data LogLevel
= Debug
| Info
| Warn
| Error
deriving (Bounded, Enum, Eq, Ord, Show)
instance TTC.Parse LogLevel where
parse = TTC.parseEnum' "LogLevel" True False
instance TTC.Render LogLevel where
render = TTC.renderWithShowSome libraries use Read instances as the only way to
parse the provided types. The following function is available for
working with such types:
parseWithRead
:: (Read a, Textual t)
=> e -- ^ invalid input error
-> t -- ^ textual input to parse
-> Either e a -- ^ error or parsed valueAs with parseEnum, the error message is taken as a
parameter, and the parseWithRead' function calls
parseWithRead with a simple error message:
parseWithRead'
:: (Read a, Textual t, Textual e)
=> String -- ^ name to include in error messages
-> t -- ^ textual input to parse
-> Either e a -- ^ error or parsed valueThe maybeParseWithRead function uses a Read
instance to parse to a Maybe type:
maybeParseWithRead
:: (Read a, Textual t)
=> t -- ^ textual input to parse
-> Maybe a -- ^ parsed value or 'Nothing' if invalidThe readsEnum function implements a ReadS
instance using parseEnum, consuming all of the input:
readsEnum
:: (Bounded a, Enum a, Render a)
=> Bool -- ^ case-insensitive when 'True'
-> Bool -- ^ accept unique prefixes when 'True'
-> ReadS aThe readsWithParse function implements a
ReadS instance using a Parse instance,
consuming all of the input:
readsWithParse :: Parse a => ReadS aChangelog
June 10, 2021
The article is updated to accompany the release of TTC
1.1.0.0 with the following changes:
- The auxiliary functions for builder and short types are added.
- Default instances are added.
- The
Data.TTC.Instancesmodule is removed.