Aeson 2 Object Coercion

The aeson-pretty library was recently updated to support aeson 2, so there are no more dependencies blocking ginger. I had a little bit of time yesterday evening, so I experimented with updating ginger to support aeson 2.

After first writing a simple implementation, I was surprised to discover that the aeson API includes support for coercions: Key.coercionToText, KeyMap.coercionToHashMap, and KeyMap.coercionToMap.

These are all Maybe values, each providing the specified Coercion when the underlying representation allows it. Currently, one of KeyMap.coercionToHashMap and KeyMap.coercionToMap contains a value, depending on the ordered-keymap flag, and Key.coercionToText always contains a value. I wrote an implementation that uses these values to convert the object in an aeson Value to a HashMap or Map, using the provided coercions when available and falling back to Map if both KeyMap.coercionToHashMap and KeyMap.coercionToMap are Nothing.

#if MIN_VERSION_aeson(2,0,0)
rawJSONToGVal (JSON.Object km) =
    let keyToText = Prelude.maybe AK.toText Coercion.coerceWith AK.coercionToText
    in case (AKM.coercionToHashMap, AKM.coercionToMap) of
        (Just coercion, _mMapCoercion) ->
            toGVal . HashMap.mapKeys keyToText $ Coercion.coerceWith (Coercion.sym coercion) km
        (Nothing, Just coercion) ->
            toGVal . Map.mapKeys keyToText $ Coercion.coerceWith (Coercion.sym coercion) km
        (Nothing, Nothing) ->
            toGVal . Map.fromList . List.map (first keyToText) $ AKM.toList km
#else
rawJSONToGVal (JSON.Object o) = toGVal o
#endif

(Note: This code is written to match the style of the ginger source.)
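
To see how Maybe Coercion values of this kind behave in isolation, here is a minimal sketch that needs only base and containers. It is not the aeson or ginger code: Key, KeyMap, coercionToText, coercionToMap, keyToString, keyMapToList, and toStringMap are hypothetical stand-ins defined here for illustration, and String is used in place of Text to avoid extra dependencies. Only coerceWith and sym are real functions, both from Data.Type.Coercion.

```haskell
import qualified Data.Map.Strict as Map
import Data.Type.Coercion (Coercion (..), coerceWith, sym)

-- Hypothetical stand-ins for aeson's Key and KeyMap (not the real types).
newtype Key = Key String
  deriving (Eq, Ord, Show)

newtype KeyMap v = KeyMap (Map.Map Key v)

-- Feature-flag style values: Just when the representation permits a coercion.
coercionToText :: Maybe (Coercion Key String)
coercionToText = Just Coercion

coercionToMap :: Maybe (Coercion (Map.Map Key v) (KeyMap v))
coercionToMap = Just Coercion

-- Ordinary conversions, used when a coercion is unavailable.
keyToString :: Key -> String
keyToString (Key s) = s

keyMapToList :: KeyMap v -> [(Key, v)]
keyMapToList (KeyMap m) = Map.toList m

-- Convert to a map with plain keys, coercing where possible.
toStringMap :: KeyMap v -> Map.Map String v
toStringMap km =
  let keyToText = maybe keyToString coerceWith coercionToText
  in case coercionToMap of
       -- The coercion goes from Map to KeyMap, so flip it with sym.
       Just c  -> Map.mapKeys keyToText (coerceWith (sym c) km)
       Nothing -> Map.fromList [(keyToText k, v) | (k, v) <- keyMapToList km]
```

Note that even when both coercions are available, the keys still have to be converted with Map.mapKeys, because the container and the key type are coerced separately.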

I created an issue to get some feedback, and Tobias replied very quickly! He suggested always using Map in order to avoid coercions. That sounds like a good idea to me, as these coercions make me uneasy as well. Besides, users who need to maximize performance can convert objects to a HashMap themselves, in order to use the HashMap instance of ToGVal.

#if MIN_VERSION_aeson(2,0,0)
rawJSONToGVal (JSON.Object km) = toGVal . Map.fromList . List.map (first AK.toText) $ AKM.toList km
#else
rawJSONToGVal (JSON.Object o) = toGVal o
#endif

(Note: This code is written to match the style of the ginger source.)

Today, I wrote some benchmarks to see how much the performance is affected. The benchmark source is available on GitHub, but note that it uses my local clone of ginger, which has not been pushed yet.

The above coercions are not very satisfactory because the data structure and the keys are coerced separately, which requires mapping over the keys. For my benchmarks, I decided to use a larger hammer: unsafeCoerce. The idea is to use the above API as feature flags to determine which conversion implementation to use, and to use unsafeCoerce to convert directly to the target type when possible.

unsafeAesonToGVal :: forall m. A.Value -> Ginger.GVal m
#if MIN_VERSION_aeson(2,0,0)
unsafeAesonToGVal =
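    case (AK.coercionToText, AKM.coercionToHashMap, AKM.coercionToMap) of
      -- Key is Text and KeyMap is a HashMap: coerce objects directly
      (Just{}, Just{}, _mMapCoercion) -> goHashMapText
      -- Key is Text and KeyMap is a Map: coerce objects directly
      (Just{}, _mHashMapCoercion, Just{}) -> goMapText
      -- KeyMap is a HashMap but Key is not Text: rebuild as a HashMap
      (Nothing, Just{}, _mMapCoercion) -> goHashMap
      -- all other cases: rebuild as a Map
      (_mTextCoercion, _mHashMapCoercion, _mMapCoercion) -> goMap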
  where
    ...
#else
unsafeAesonToGVal = Ginger.toGVal
#endif

The goHashMapText implementation coerces objects to HashMap Text Value.

goHashMapText :: A.Value -> Ginger.GVal m
goHashMapText v = setAsJSON v $ case v of
  A.Number n -> Ginger.toGVal n
  A.String s -> Ginger.toGVal s
  A.Bool b -> Ginger.toGVal b
  A.Null -> Ginger.toGVal ()
  A.Array a -> Ginger.toGVal . map goHashMapText $ Vector.toList a
  A.Object o -> Ginger.toGVal $
    HashMap.map goHashMapText (unsafeCoerce o :: HashMap Text A.Value)

The goMapText implementation coerces objects to Map Text Value.

goMapText :: A.Value -> Ginger.GVal m
goMapText v = setAsJSON v $ case v of
  A.Number n -> Ginger.toGVal n
  A.String s -> Ginger.toGVal s
  A.Bool b -> Ginger.toGVal b
  A.Null -> Ginger.toGVal ()
  A.Array a -> Ginger.toGVal . map goMapText $ Vector.toList a
  A.Object o -> Ginger.toGVal $
    Map.map goMapText (unsafeCoerce o :: Map Text A.Value)

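The goHashMap implementation is used when aeson represents objects with a HashMap but Key cannot be coerced to Text. It rebuilds the HashMap from a list, using bimap to transform both keys and values in a single pass.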

goHashMap :: A.Value -> Ginger.GVal m
goHashMap v = setAsJSON v $ case v of
  A.Number n -> Ginger.toGVal n
  A.String s -> Ginger.toGVal s
  A.Bool b -> Ginger.toGVal b
  A.Null -> Ginger.toGVal ()
  A.Array a -> Ginger.toGVal . map goHashMap $ Vector.toList a
  A.Object o ->
    Ginger.toGVal . HashMap.fromList . map (bimap AK.toText goHashMap) $
      AKM.toList o

The goMap implementation is used in all other cases. It is the same as goHashMap, including the use of bimap to transform both keys and values in a single pass, except that it builds a Map.

goMap :: A.Value -> Ginger.GVal m
goMap v = setAsJSON v $ case v of
  A.Number n -> Ginger.toGVal n
  A.String s -> Ginger.toGVal s
  A.Bool b -> Ginger.toGVal b
  A.Null -> Ginger.toGVal ()
  A.Array a -> Ginger.toGVal . map goMap $ Vector.toList a
  A.Object o ->
    Ginger.toGVal . Map.fromList . map (bimap AK.toText goMap) $
      AKM.toList o
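
For readers unfamiliar with bimap, it comes from Data.Bifunctor and applies one function to each component of a pair. The following minimal sketch contrasts it with mapping over keys and values separately. The names onePass and twoPasses are hypothetical and exist only for this illustration.

```haskell
import Data.Bifunctor (bimap, first, second)

-- Transform keys and values of an association list in one traversal.
onePass :: (k -> k') -> (v -> v') -> [(k, v)] -> [(k', v')]
onePass f g = map (bimap f g)

-- Same result, written as two separate traversals of the list.
twoPasses :: (k -> k') -> (v -> v') -> [(k, v)] -> [(k', v')]
twoPasses f g = map (first f) . map (second g)
```

Both functions return the same list. The compiler may well fuse the second form, so the difference is mostly one of clarity.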

The helper function setAsJSON sets the asJSON field of the result in all of the implementations, so that the original aeson Value is preserved.

setAsJSON :: A.Value -> Ginger.GVal m -> Ginger.GVal m
setAsJSON v gv = gv { Ginger.asJSON = Just v }
{-# INLINE setAsJSON #-}
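
The core trick can be shown in isolation with a minimal sketch that needs only base and containers. It is not the benchmark code: Key, KeyMap, coercionToText, coercionToMap, slowPath, and toStringMap are hypothetical stand-ins, with String in place of Text. A safe coerce cannot change the key type of a Map because the key parameter has a nominal role, which is why the sketch reaches for unsafeCoerce. It assumes that Key is a newtype whose Ord instance agrees with that of the wrapped type; if that assumption did not hold, the resulting Map would be silently corrupt.

```haskell
import qualified Data.Map.Strict as Map
import Data.Maybe (isJust)
import Data.Type.Coercion (Coercion (..))
import Unsafe.Coerce (unsafeCoerce)

-- Hypothetical stand-ins for aeson's Key and KeyMap (not the real types).
newtype Key = Key String
  deriving (Eq, Ord)

newtype KeyMap v = KeyMap (Map.Map Key v)

-- Used only as feature flags: is the representation what we expect?
coercionToText :: Maybe (Coercion Key String)
coercionToText = Just Coercion

coercionToMap :: Maybe (Coercion (Map.Map Key v) (KeyMap v))
coercionToMap = Just Coercion

-- Safe fallback: rebuild the map pair by pair.
slowPath :: KeyMap v -> Map.Map String v
slowPath (KeyMap m) = Map.fromList [(s, v) | (Key s, v) <- Map.toList m]

-- Reinterpret the whole structure at once when both flags are set.
toStringMap :: KeyMap v -> Map.Map String v
toStringMap km
  | isJust coercionToText && isJust coercionToMap = unsafeCoerce km
  | otherwise = slowPath km
```

Unlike the version built on coerceWith, no pass over the keys is needed on the fast path, at the cost of giving up the compiler's guarantees.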

I prepared a YAML file containing the names of numbers zero through ten in various languages, and the benchmarks use a simple template that formats this data in HTML. The benchmark program renders the template using toGVal and unsafeAesonToGVal, and stack configuration files are used to prepare various environments.

  • stack.yaml uses aeson 1
  • stack-aeson2-map.yaml uses aeson 2 with the ordered-keymap flag set to true so that Map is used
  • stack-aeson2-hashmap.yaml uses aeson 2 with the ordered-keymap flag set to false so that HashMap is used

Here are the results:

                   toGVal (ms)   unsafeAesonToGVal (ms)
aeson 1            10.72         10.79
aeson 2 Map        11.11         11.04
aeson 2 HashMap    11.18         11.12

The toGVal column illustrates the cost of adding aeson 2 support to ginger. For this test data, the cost is measurable (0.39 ms more with Map and 0.46 ms more with HashMap) but is a small percentage (roughly four percent) overhead. Note that HashMap likely performs worse than Map in this test because the objects in the test have a small number of pairs: building a Map takes O(n log n) cheap key comparisons, which for low n is less expensive than calculating a hash for every key.

The unsafeAesonToGVal column shows the performance of the above unsightly function that makes use of unsafeCoerce. When using aeson 1, it simply calls toGVal, so that measurement is not interesting, though the 0.07 ms difference between the two aeson 1 numbers gives a sense of the measurement noise. The other two measurements indicate that it is marginally faster than always using Map: 0.07 ms with Map and 0.06 ms with HashMap, which is well under one percent. The performance gain is not significant.
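
For anyone who wants to check the percentages, here is a minimal sketch of the arithmetic. The function name overheadPercent and the constant names are hypothetical; the numbers are copied from the results table.

```haskell
-- Relative overhead, in percent, of a measurement over a baseline.
overheadPercent :: Double -> Double -> Double
overheadPercent baseline measured = (measured - baseline) / baseline * 100

-- toGVal measurements from the results table, in milliseconds.
aeson1, aeson2Map, aeson2HashMap :: Double
aeson1 = 10.72
aeson2Map = 11.11
aeson2HashMap = 11.18

-- Overhead of aeson 2 support relative to aeson 1, per environment.
mapOverhead, hashMapOverhead :: Double
mapOverhead = overheadPercent aeson1 aeson2Map
hashMapOverhead = overheadPercent aeson1 aeson2HashMap
```

Here mapOverhead comes to about 3.6 and hashMapOverhead to about 4.3, in the same ballpark as the rounded figure quoted above.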

Why doesn’t unsafeAesonToGVal always get aeson 1 performance? The recursive function is interleaved with calls to toGVal in order to create GVal values. When arrays and objects are converted, these data structures must be traversed twice. The first traversal is deep; it converts all objects to the target data structure during conversion to GVal values. The second traversal (in toGVal) generally converts all values to GVal values. In this case all of the values are already GVal values, so it just does a shallow traversal with no real work. Perhaps it would be possible to get closer to aeson 1 performance by using unsafeAesonToGVal in the implementation of toGVal, but it would not be a good idea to use unsafeCoerce in the library.

Given these benchmark results, I do not plan on making use of any coercion in an attempt to maximize aeson and ginger performance.