Aeson 2 Object Coercion
The aeson-pretty library was recently updated to support aeson 2, so there are no more dependencies blocking ginger. I had a little bit of time yesterday evening, so I experimented with updating ginger to support aeson 2.
After first writing a simple implementation, I was surprised to discover that the aeson API includes support for coercions: Key.coercionToText, KeyMap.coercionToHashMap, and KeyMap.coercionToMap.
These are all Maybe values, which provide the specified Coercion when appropriate. Currently, one of KeyMap.coercionToHashMap and KeyMap.coercionToMap contains a value, depending on the ordered-keymap flag, and Key.coercionToText always contains a value. I wrote an implementation that uses these values to convert the object in an aeson Value to a HashMap or Map, using the provided coercions when available and falling back to Map if both KeyMap.coercionToHashMap and KeyMap.coercionToMap are Nothing.

    #if MIN_VERSION_aeson(2,0,0)
    rawJSONToGVal (JSON.Object km) =
      let keyToText = Prelude.maybe AK.toText Coercion.coerceWith AK.coercionToText
      in  case (AKM.coercionToHashMap, AKM.coercionToMap) of
            (Just coercion, _mMapCoercion) ->
              toGVal . HashMap.mapKeys keyToText $ Coercion.coerceWith (Coercion.sym coercion) km
            (Nothing, Just coercion) ->
              toGVal . Map.mapKeys keyToText $ Coercion.coerceWith (Coercion.sym coercion) km
            (Nothing, Nothing) ->
              toGVal . Map.fromList . List.map (first keyToText) $ AKM.toList km
    #else
    rawJSONToGVal (JSON.Object o) = toGVal o
    #endif

(Note: This code is written to match the style of the ginger source.)
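To see how a Maybe-wrapped Coercion behaves in isolation, here is a minimal sketch that uses only Data.Type.Coercion from base. The Key type, toText, coercionToText, keyToText, and textToKey defined here are hypothetical stand-ins written for illustration; they are not aeson's definitions, and String is used in place of Text to keep the example dependency-free.

```haskell
import Data.Type.Coercion (Coercion (..), coerceWith, sym)

-- Hypothetical stand-in for an opaque key type; not aeson's real Key.
newtype Key = Key String

-- Slow path: an explicit conversion function.
toText :: Key -> String
toText (Key s) = s

-- A library can export evidence that two types share a representation,
-- wrapped in Maybe so that it can withhold the evidence when they do not.
coercionToText :: Maybe (Coercion Key String)
coercionToText = Just Coercion

-- Use the zero-cost coercion when it is offered, the function otherwise.
keyToText :: Key -> String
keyToText = maybe toText coerceWith coercionToText

-- sym flips the direction of a coercion.
textToKey :: Maybe (String -> Key)
textToKey = coerceWith . sym <$> coercionToText
```

The keyToText definition has the same shape as the one in the implementation above: maybe picks between the fallback function and coerceWith depending on whether a coercion is available.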
I created an issue to get some feedback, and Tobias replied very quickly! He suggested always using Map in order to avoid coercions. That sounds like a good idea to me, as these coercions make me uneasy as well. Besides, users who need to maximize performance can convert objects to a HashMap themselves, in order to use the HashMap instance of ToGVal.

    #if MIN_VERSION_aeson(2,0,0)
    rawJSONToGVal (JSON.Object km) = toGVal . Map.fromList . List.map (first AK.toText) $ AKM.toList km
    #else
    rawJSONToGVal (JSON.Object o) = toGVal o
    #endif

(Note: This code is written to match the style of the ginger source.)
Today, I wrote some benchmarks to see how much the performance is affected. The benchmarks source is available on GitHub, but note that it uses my local clone of ginger, which has not been pushed yet.
The above coercions are not very satisfactory because the data structure and the keys are coerced separately, which requires mapping over the keys. For my benchmarks, I decided to use a larger hammer: unsafeCoerce. The idea is to use the above API as feature flags to determine which conversion implementation to use, and to use unsafeCoerce to convert directly to the target type when possible.

    unsafeAesonToGVal :: forall m. A.Value -> Ginger.GVal m
    #if MIN_VERSION_aeson(2,0,0)
    unsafeAesonToGVal =
        case (AK.coercionToText, AKM.coercionToHashMap, AKM.coercionToMap) of
          (Just{}, Just{}, _mMapCoercion) -> goHashMapText
          (Just{}, _mHashMapCoercion, Just{}) -> goMapText
          (Nothing, Just{}, _mMapCoercion) -> goHashMap
          (_mTextCoercion, _mHashMapCoercion, _mMapCoercion) -> goMap
      where
        ...
    #else
    unsafeAesonToGVal = Ginger.toGVal
    #endif

The goHashMapText implementation coerces objects to HashMap Text Value.

    goHashMapText :: A.Value -> Ginger.GVal m
    goHashMapText v = setAsJSON v $ case v of
      A.Number n -> Ginger.toGVal n
      A.String s -> Ginger.toGVal s
      A.Bool b -> Ginger.toGVal b
      A.Null -> Ginger.toGVal ()
      A.Array a -> Ginger.toGVal . map goHashMapText $ Vector.toList a
      A.Object o -> Ginger.toGVal $
        HashMap.map goHashMapText (unsafeCoerce o :: HashMap Text A.Value)

The goMapText implementation coerces objects to Map Text Value.

    goMapText :: A.Value -> Ginger.GVal m
    goMapText v = setAsJSON v $ case v of
      A.Number n -> Ginger.toGVal n
      A.String s -> Ginger.toGVal s
      A.Bool b -> Ginger.toGVal b
      A.Null -> Ginger.toGVal ()
      A.Array a -> Ginger.toGVal . map goMapText $ Vector.toList a
      A.Object o -> Ginger.toGVal $
        Map.map goMapText (unsafeCoerce o :: Map Text A.Value)

The goHashMap implementation is used when aeson uses a HashMap but the key is not Text.

    goHashMap :: A.Value -> Ginger.GVal m
    goHashMap v = setAsJSON v $ case v of
      A.Number n -> Ginger.toGVal n
      A.String s -> Ginger.toGVal s
      A.Bool b -> Ginger.toGVal b
      A.Null -> Ginger.toGVal ()
      A.Array a -> Ginger.toGVal . map goHashMap $ Vector.toList a
      A.Object o ->
        Ginger.toGVal . HashMap.fromList . map (bimap AK.toText goHashMap) $
          AKM.toList o

The goMap implementation is used in other cases. It is similar to the above implementation, but it uses bimap to transform both keys and values in a single pass.

    goMap :: A.Value -> Ginger.GVal m
    goMap v = setAsJSON v $ case v of
      A.Number n -> Ginger.toGVal n
      A.String s -> Ginger.toGVal s
      A.Bool b -> Ginger.toGVal b
      A.Null -> Ginger.toGVal ()
      A.Array a -> Ginger.toGVal . map goMap $ Vector.toList a
      A.Object o ->
        Ginger.toGVal . Map.fromList . map (bimap AK.toText goMap) $
          AKM.toList o

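The difference between converting keys and values separately and converting them together with bimap can be shown on a toy example. This is only a sketch: Key, toText, twoPass, and onePass are hypothetical names, String stands in for Text, and only base and containers are used.

```haskell
import Data.Bifunctor (bimap, first)
import qualified Data.Map.Strict as Map

-- Hypothetical opaque key type standing in for aeson's Key.
newtype Key = Key String

toText :: Key -> String
toText (Key s) = s

-- Two passes: convert the keys while building the Map, then map over
-- the finished Map to convert the values.
twoPass :: (a -> b) -> [(Key, a)] -> Map.Map String b
twoPass f = Map.map f . Map.fromList . map (first toText)

-- One pass over the pairs: bimap converts the key and the value of
-- each pair before the Map is built.
onePass :: (a -> b) -> [(Key, a)] -> Map.Map String b
onePass f = Map.fromList . map (bimap toText f)
```

Both functions produce the same Map for the same input; the second simply avoids walking the finished Map again.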
Helper function setAsJSON is used to set the asJSON value in all of the implementations.

    setAsJSON :: A.Value -> Ginger.GVal m -> Ginger.GVal m
    setAsJSON v gv = gv { Ginger.asJSON = Just v }
    {-# INLINE setAsJSON #-}

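The feature-flag idea can also be illustrated without aeson or ginger. In the following sketch, every type and function (Key, KeyMap, toText, toList, coercionToText, coercionToMap, toTextMap) is a hypothetical stand-in defined locally, and String replaces Text. The coercion values are never applied; only their constructors are inspected to decide whether unsafeCoerce may reinterpret the whole structure.

```haskell
import qualified Data.Map.Strict as Map
import Data.Type.Coercion (Coercion (..))
import Unsafe.Coerce (unsafeCoerce)

-- Hypothetical key and container types.  In a real library the
-- constructors would be hidden, which is why flags are needed at all.
newtype Key = Key String deriving (Eq, Ord)
newtype KeyMap v = KeyMap (Map.Map Key v)

toText :: Key -> String
toText (Key s) = s

toList :: KeyMap v -> [(Key, v)]
toList (KeyMap m) = Map.toList m

-- Flags in the style of the coercion values: Just means that the
-- representation really is the one named in the type.
coercionToText :: Maybe (Coercion Key String)
coercionToText = Just Coercion

coercionToMap :: Maybe (Coercion (Map.Map Key v) (KeyMap v))
coercionToMap = Just Coercion

-- When both flags are set, KeyMap v and Map String v share a runtime
-- representation (and, in this toy, the same key ordering), so the
-- structure is reinterpreted in one step.  Otherwise it is rebuilt.
toTextMap :: KeyMap v -> Map.Map String v
toTextMap km = case (coercionToText, coercionToMap) of
  (Just{}, Just{}) -> unsafeCoerce km
  _                -> Map.fromList [(toText k, v) | (k, v) <- toList km]
```

The fast path is only sound as long as the flags tell the truth about the representation, which is exactly the kind of assumption that makes unsafeCoerce uncomfortable.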
I prepared a YAML file containing the names of numbers zero through ten in various languages, and the benchmarks use a simple template that formats this data in HTML. The benchmark program renders the template using toGVal and unsafeAesonToGVal, and stack configuration files are used to prepare various environments.
- stack.yaml uses aeson 1
- stack-aeson2-map.yaml uses aeson 2 with the ordered-keymap flag set to true so that Map is used
- stack-aeson2-hashmap.yaml uses aeson 2 with the ordered-keymap flag set to false so that HashMap is used
Here are the results:
Configuration | toGVal (ms) | unsafeAesonToGVal (ms) |
---|---|---|
aeson 1 | 10.72 | 10.79 |
aeson 2 Map | 11.11 | 11.04 |
aeson 2 HashMap | 11.18 | 11.12 |
The toGVal column illustrates the cost of adding aeson 2 support to ginger. For this test data, the cost is measurable (almost half a millisecond more time) but is a small percentage (around four percent) overhead. Note that HashMap likely performs worse than Map in this test because the objects in the test have a small number of pairs, and the O(n log n) processing of cheap operations for low n is less expensive than the calculation of hashes.
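As a quick check of that percentage, the relative overhead can be computed directly from the toGVal column. The function and constant names below are made up for illustration; the timings are the ones reported in the table.

```haskell
-- Relative overhead of a measurement over a baseline, in percent.
overheadPercent :: Double -> Double -> Double
overheadPercent baseline measured = (measured - baseline) / baseline * 100

-- toGVal timings in milliseconds, copied from the results table.
aeson1, aeson2Map, aeson2HashMap :: Double
aeson1 = 10.72
aeson2Map = 11.11
aeson2HashMap = 11.18
```

Evaluating overheadPercent aeson1 aeson2Map gives roughly 3.6, and overheadPercent aeson1 aeson2HashMap gives roughly 4.3.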
The unsafeAesonToGVal column shows the performance of the above unsightly function that makes use of unsafeCoerce. When using aeson 1, it simply calls toGVal, so that measurement is not interesting. The other two measurements indicate that it is marginally faster than always using Map. The performance gain is not very significant, however.
Why doesn’t unsafeAesonToGVal always get aeson 1 performance? The recursive function is interleaved with calls to toGVal in order to create GVal values. When arrays and objects are converted, these data structures must be traversed twice. The first traversal is deep; it converts all objects to the target data structure during conversion to GVal values. The second traversal (in toGVal) generally converts all values to GVal values. In this case all of the values are already GVal values, so it just does a shallow traversal with no real work. Perhaps it would be possible to get closer to aeson 1 performance by using unsafeAesonToGVal in the implementation of toGVal, but it would not be a good idea to use unsafeCoerce in the library.
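The double traversal can be reproduced with a toy model. Everything in this sketch is hypothetical and deliberately simplified: G stands in for GVal, V for an aeson Value, and ToG for a ToGVal-style class. It is meant to show the shape of the problem, not ginger's actual instances.

```haskell
-- Toy target type standing in for ginger's GVal.
data G = GNum Int | GList [G]
  deriving (Eq, Show)

-- Toy source type standing in for an aeson Value.
data V = VNum Int | VArray [V]

-- Toy version of a ToGVal-style class.
class ToG a where
  toG :: a -> G

instance ToG G where
  toG = id                 -- already converted: nothing to do

instance ToG Int where
  toG = GNum

instance ToG a => ToG [a] where
  toG = GList . map toG    -- walks the list, converting each element

-- Deep conversion interleaved with toG, in the style of the go*
-- functions: map go is the deep traversal, and toG then walks the
-- resulting [G] a second time, applying id to every element.
go :: V -> G
go (VNum n)    = toG n
go (VArray vs) = toG (map go vs)
```

For example, go (VArray [VNum 1, VArray [VNum 2]]) evaluates to GList [GNum 1, GList [GNum 2]], with each list walked once by map go and once more by the list instance of toG.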
Given these benchmark results, I do not plan on making use of any coercion in an attempt to maximize aeson and ginger performance.