Render and Parse
Static type checking is a characteristic feature of the Haskell
programming language, and types play an important role in the design of
Haskell software. Distinct types, created using data
or
newtype
keywords, have a number of benefits over
general-purpose types. They can help developers avoid passing the wrong
value to a function, they provide more “documentation” than
general-purpose types, and they can greatly aid in refactoring. They can
also define some boundaries within an API, making large software easier
to understand.
Many data types have a natural textual representation, and one often needs to implement a function that “renders” a value to text. For example, the text may be displayed to the user, used in an API call, inserted into a database, or included in a log message. One also often needs to implement a function that “parses” text to a value. For example, the text may be user input, the response of an API call, or the result of a database lookup.
Haskell has many textual data types, described in the first article
of this series: Textual
Type Class. In most cases, each distinct data type renders/parses to
a specific textual data type, and that type can be converted to any
other textual data type that may be required. For example, an API call
might require a Text
value, a database insert might require
a ByteString
value, and a log message might require a
String
value.
The TTC
(Textual Type Classes) library provides a way to manage the
rendering and parsing of data types to multiple textual data types
without unnecessary conversion between textual data types and without
having to remember which textual data type is best used with which data
type. This article focuses on the Render
and
Parse
type classes.
As a minimal example, this article makes use of the following type.
Name
is defined using newtype
with an
underlying type of Text
.
module Example.Name where
import Data.Text (Text)
newtype Name = Name Text
Show
and Read
A good way to introduce the motivation for the Render
and Parse
type classes is to consider the problems with
using the Show
and Read
type classes that are
defined in the base
package. Both the Show
and Render
type classes
are used to render values as text, and both the Read
and
Parse
type classes are used to parse values from text.
Intended Use
The Show
and Read
type classes are intended
to work with text that contains Haskell source code. A Show
instance should render a string of valid Haskell code that can recreate
the type when parsed using a Read
instance. This can be
useful when debugging, but Haskell code is rarely the desired
representation for passing a value to an API or storing it in a
database.
There are three basic ways to create Show
and
Read
instances for the example Name
type. One
way is to derive the instances as follows.
module Example.Name where
import Data.Text (Text)
newtype Name = Name Text
deriving (Read, Show)
A value of this type is created using the Name
constructor, and the derived instances reflect this.
λ: show (Name "Bill")
"Name \"Bill\""
it :: String
λ: print (Name "Bill")
Name "Bill"
it :: ()
λ: read "Name \"Bill\"" :: Name
Name "Bill"
it :: Name
Another way to create Show
and Read
instances is to use some extensions to derive the instances based on the
wrapped type.
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Example.Name where
import Data.Text (Text)
newtype Name = Name Text
deriving newtype (Read, Show)
The derived instances no longer use strings prefixed with the
constructor. Since there is no IsString
instance, note that
this definition deviates from the intended usage of Show
and Read
because the rendered strings are not valid Haskell
code for creating a value.
λ: show (Name "Bill")
"\"Bill\""
it :: String
λ: print (Name "Bill")
"Bill"
it :: ()
λ: read "\"Bill\"" :: Name
"Bill"
it :: Name
Note that the rendered strings are quoted. This involves escaping in cases where the rendered value includes quotation marks.
It is quite common for developers to use Show
and
Read
instances for their own purposes. For example, a
developer may want to render a Name
as the name itself,
without quotes. This can be done by writing the instances by hand.
module Example.Name where
import qualified Data.Text as T
import Data.Text (Text)
newtype Name = Name Text
instance Read Name where
readsPrec _ s = [(Name (T.pack s), "")]
instance Show Name where
show (Name name) = T.unpack name
Such instances deviate even more from the intended usage of
Show
and Read
, as the rendered strings are not
valid Haskell code.
λ: show (Name "Bill")
"Bill"
it :: String
λ: print (Name "Bill")
Bill
it :: ()
λ: read "Bill" :: Name
Bill
it :: Name
The use of Show
instances is a common source of bugs. It
is easy to derive an instance, and one often does so when prototyping
new software. Use of the derived instance with an API that does not
expect Haskell code can then cause issues.
As an example of such a bug in production software, a derived
Show
instance was used to render a string for a value in a
CSV file. The rendered string included quotation marks in the middle of
the string (like Name "Bill"
), which caused the CSV file to
be invalid. The invalid CSV file resulted in other failures, and the
bugs persisted for over two years before the use of the
Show
instance was identified and corrected.
String
Usage
Another problem with using Show
and Read
instances is that doing so utilizes the String
type. Since
String
is a type alias for a list of characters, it has
poor performance.
In the example, the Name
type has an underlying value of
type Text
. To render a Text
value to pass to
an API, one could use code like the following:
$ T.pack (show name) callApi
Sometimes people will even create a helper function for this because it is used so often.
showT :: Show a => a -> Text
= T.pack . show showT
The String
type is used only because of the use of
Show
.
Read
Errors
The use of Read
to parse introduces another issue: it
does not provide support for detailed error messages. The error message
provided by the readEither
function (defined in
Text.Read
) is almost completely unhelpful:
λ: import Text.Read (readEither)
λ: readEither "Test" :: Either String Name
Left "Prelude.read: no parse"
It can be frustrating when you run into such an error message while starting to work with a large project that you are unfamiliar with, as it gives no clues about what is not parsing. When using simple types, one generally provides a better error message when parsing:
λ: import Text.Read (readMaybe)
λ: maybe (Left "invalid Name") Right (readMaybe "Test" :: Maybe Name)
Left "invalid Name"
When parsing complex types, however, it is very helpful to use error
messages that provide details about why parsing fails. Parsing a
Name
could fail due to the use of an invalid character
(such as null) or due to length limitations. The Read
type
class does not allow for this.
Type-Specific Functions
Instead of using Show
and Read
, one can
write functions to render and parse each type. In the following example,
mkName
is a smart constructor that is used to ensure that a
Name
does not contain invalid characters, and
toText
renders a Name
as Text
by
simply unwrapping the underlying value.
module Example.Name
Name
(
, mkName
, toTextwhere
)
import Data.Char (isPrint)
import qualified Data.Text as T
import Data.Text (Text)
newtype Name = Name Text
mkName :: Text -> Either String Name
mkName t| T.all isPrint t = Right $ Name t
| otherwise = Left "invalid Name: contains invalid characters"
toText :: Name -> Text
Name name) = name toText (
Since the underlying type is Text
, the smart constructor
parses Text
and the toText
function renders
Text
. In cases where other textual types are required,
explicit conversion to/from Text
is necessary. This can be
done in several ways, and use of the Textual
Type Class provides a way with little boilerplate. The conversion
can be done at the location that an API function is called, or it could
be included in the API, as follows:
module Example.Name
Name
(
, mkName
, toTextwhere
)
import Control.Monad (unless)
import Data.Char (isPrint)
import qualified Data.Text as T
import Data.Text (Text)
import qualified Data.TTC as TTC
newtype Name = Name Text
mkName :: TTC.Textual t => t -> Either String Name
= TTC.asT $ \t -> do
mkName isPrint t) $
unless (T.all Left "invalid Name: contains invalid characters"
pure $ Name t
toText :: TTC.Textual t => Name -> t
Name name) = TTC.fromT name toText (
The Render
and Parse
type classes use this approach. The
use of type classes provides an interface that is consistent as well as
easy to use.
Render
The Render
type class renders a data type as a textual
data type.
class Render a where
render :: Textual t => a -> t
Instances may use the textual data type that is most natural. The
return value of the render
function uses the
Textual
type class to automatically convert to any
Textual
type. In cases where the type is ambiguous, one of
the following functions may be used:
renderS :: Render a => a -> String
renderT :: Render a => a -> T.Text
renderTL :: Render a => a -> TL.Text
renderTLB :: Render a => a -> TLB.Builder
renderBS :: Render a => a -> BS.ByteString
renderBSL :: Render a => a -> BSL.ByteString
renderBSB :: Render a => a -> BSB.Builder
renderSBS :: Render a => a -> SBS.ShortByteString
A Render
instance for the example Name
type
may be written as follows:
module Example.Name
Name
( where
)
import Data.Text (Text)
import qualified Data.TTC as TTC
newtype Name = Name Text
instance TTC.Render Name where
Name name) = TTC.fromT name render (
The TTC.render
function can then be used to render a
Name
value to a String
to print to the screen
as well as a Text
value to pass to some API call:
putStrLn $ "calling API for " ++ TTC.render name
$ TTC.render name callSomeApi
Since the underlying value of Name
is Text
,
that second call to TTC.convert
is evaluated to
id
and is optimized away. The developer can use
TTC.convert
without having to worry about such details.
While Show
instances should produce valid
Haskell code, Render
instances are intended to be used in
any way that the developer needs. The name of the type class was
selected to indicate that it is not necessarily isomorphic: the rendered
text does not need to contain all of the information required to
recreate the original value.
The Data.TTC
module does not include any
Render
instances, allowing developers to define instances
as required for each use case. For example, one may want to define a
Render
instance for the Int
type that formats
the integer in a specific way.
Default instances for some basic data types are defined for the
RenderDefault
type class, and you can load the
Render
instance as follows:
instance TTC.Render Int
Since a type may have at most one instance of a given type class, special care must be taken when defining type class instances in a shared library. In particular, orphan instances should generally not be used in shared libraries since they prevent users of the libraries from writing their own instances.
Render
Utility
Functions
Some libraries use Show
instances as the only way to
render the provided types. The following function is available for
working with such types:
renderWithShow :: (Show a, Textual t) => a -> t
Parse
The Parse
type class parses a data type from a textual
data type.
class Parse a where
parse :: (Textual t, Textual e) => t -> Either e a
Instances may use the textual data type that is most natural, using
one of the Textual
“as” functions for conversion. In cases
when the type is ambiguous, one of the following functions may be
used:
parseS :: (Parse a, Textual e) => String -> Either e a
parseT :: (Parse a, Textual e) => T.Text -> Either e a
parseTL :: (Parse a, Textual e) => TL.Text -> Either e a
parseTLB :: (Parse a, Textual e) => TLB.Builder -> Either e a
parseBS :: (Parse a, Textual e) => BS.ByteString -> Either e a
parseBSL :: (Parse a, Textual e) => BSL.ByteString -> Either e a
parseBSB :: (Parse a, Textual e) => BSB.Builder -> Either e a
parseSBS :: (Parse a, Textual e) => SBS.ShortByteString -> Either e a
A Parse
instance for the example Name
type
may be written as follows:
module Example.Name
Name
( where
)
import Control.Monad (unless)
import Data.Char (isPrint)
import qualified Data.Text as T
import Data.Text (Text)
import qualified Data.TTC as TTC
newtype Name = Name Text
instance TTC.Parse Name where
= TTC.asT $ \t -> do
parse isPrint t) . Left $
unless (T.all "invalid Name: contains invalid characters"
TTC.fromS pure $ Name t
Note that the only difference with the implementation of the smart
constructor in the Type-Specific
Functions example above is that the error message is also
Textual
.
While Read
instances should parse valid Haskell
code, Parse
instances are intended to be used in any way
that the developer needs. The Data.TTC
module does not
include any Parse
instances, allowing developers to define
instances as required for each use case.
Default instances for some basic data types are defined for the
ParseDefault
type class, and you can load the
Parse
instance as follows:
instance TTC.Parse Int
Since a type may have at most one instance of a given type class, special care must be taken when defining type class instances in a shared library. In particular, orphan instances should generally not be used in shared libraries since they prevent users of the libraries from writing their own instances.
Parse
Auxiliary
Functions
TTC provides some auxiliary functions for common use cases. The
parseMaybe
functions are used to parse text when the error
message is not needed:
parseMaybe :: Parse a => Textual t => t -> Maybe a
parseMaybeS :: Parse a => String -> Maybe a
parseMaybeT :: Parse a => T.Text -> Maybe a
parseMaybeTL :: Parse a => TL.Text -> Maybe a
parseMaybeTLB :: Parse a => TLB.Builder -> Maybe a
parseMaybeBS :: Parse a => BS.ByteString -> Maybe a
parseMaybeBSL :: Parse a => BSL.ByteString -> Maybe a
parseMaybeBSB :: Parse a => BSB.Builder -> Maybe a
parseMaybeSBS :: Parse a => SBS.ShortByteString -> Maybe a
The parseUnsafe
functions are used to parse text that is
already known to be valid:
parseUnsafe :: (Parse a, Textual t) => t -> a
parseUnsafeS :: Parse a => String -> a
parseUnsafeT :: Parse a => T.Text -> a
parseUnsafeTL :: Parse a => TL.Text -> a
parseUnsafeTLB :: Parse a => TLB.Builder -> a
parseUnsafeBS :: Parse a => BS.ByteString -> a
parseUnsafeBSL :: Parse a => BSL.ByteString -> a
parseUnsafeBSB :: Parse a => BSB.Builder -> a
parseUnsafeSBS :: Parse a => SBS.ShortByteString -> a
Parse
Utility
Functions
The parseEnum
function is used to parse a value of an
enumerated type, using a Render
instance to determine the
textual representations of each value:
parseEnum :: (Bounded a, Enum a, Render a, Textual t)
=> Bool -- ^ case-insensitive when 'True'
-> Bool -- ^ accept unique prefixes when 'True'
-> e -- ^ invalid input error
-> e -- ^ ambiguous input error
-> t -- ^ textual input to parse
-> Either e a -- ^ error or parsed value
The parseEnum'
function calls parseEnum
with some simple error messages:
parseEnum' :: (Bounded a, Enum a, Render a, Textual t, Textual e)
=> String -- ^ name to include in error messages
-> Bool -- ^ case-insensitive when 'True'
-> Bool -- ^ accept unique prefixes when 'True'
-> t -- ^ textual input to parse
-> Either e a -- ^ error or parsed value
For example, parseEnum
can be used to parse log
levels:
import qualified Data.TTC as TTC
data LogLevel
= Debug
| Info
| Warn
| Error
deriving (Bounded, Enum, Eq, Ord, Show)
instance TTC.Parse LogLevel where
= TTC.parseEnum' "LogLevel" True False
parse
instance TTC.Render LogLevel where
= TTC.renderWithShow render
Some libraries use Read
instances as the only way to
parse the provided types. The following function is available for
working with such types:
parseWithRead :: (Read a, Textual t)
=> e -- ^ invalid input error
-> t -- ^ textual input to parse
-> Either e a -- ^ error or parsed value
As with parseEnum
, the error message is taken as a
parameter, and the parseWithRead'
function calls
parseWithRead
with a simple error message:
parseWithRead' :: (Read a, Textual t, Textual e)
=> String -- ^ name to include in error messages
-> t -- ^ textual input to parse
-> Either e a -- ^ error or parsed value
The maybeParseWithRead
function uses a Read
instance to parse to a Maybe
type:
maybeParseWithRead :: (Read a, Textual t)
=> t -- ^ textual input to parse
-> Maybe a -- ^ parsed value or 'Nothing' if invalid
The readsEnum
function implements a ReadS
instance using parseEnum
, consuming all of the input:
readsEnum :: (Bounded a, Enum a, Render a)
=> Bool -- ^ case-insensitive when 'True'
-> Bool -- ^ accept unique prefixes when 'True'
-> ReadS a
The readsWithParse
function implements a
ReadS
instance using a Parse
instance,
consuming all of the input:
readsWithParse :: Parse a => ReadS a
Changelog
June 10, 2021
The article is updated to accompany the release of TTC
1.1.0.0
with the following changes:
- The auxiliary functions for builder and short types are added.
- Default instances are added.
- The
Data.TTC.Instances
module is removed.