
JSON/YAML Object Key Order Using Aeson

JSON and YAML are often used for configuration because these formats provide a standard way to specify structured data that is both widely recognized and widely supported. Objects are used to specify an unordered set of name-value pairs, but there are a number of cases where a fixed order in the text file greatly improves the usability of configuration because it makes the configuration easier for users to read and understand. This blog entry describes how to control that order when using the popular aeson library.

Both JSON and YAML are far from perfect, and there are many people who dislike using these formats for configuration. I too have worked with huge YAML configuration files where it is very difficult to keep track of the nested context. Regardless, ordering object name-value pairs can make such configuration at least a bit easier to read.

The aeson library uses the Value sum type to represent JSON values, and it represents objects using the KeyMap data type, which is either a Map (from the containers package) or a HashMap (from the unordered-containers package), depending on the value of the ordered-keymap flag when the package is built. When the Map data type is used (when ordered-keymap is true), object name-value pairs are output in lexicographic order of the names, which generally looks like alphabetic order to users. When the HashMap data type is used (when ordered-keymap is false), object name-value pairs are output in the order of the hashes of the names, which generally looks like an arbitrary order to users. When name-value pairs are represented using either of these types, any user-defined order is lost and cannot be recovered.
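The loss of order can be seen with a minimal sketch that decodes JSON to a Value and encodes it again. The roundTrip helper and the sample input are made up for illustration; the order of the output depends on how aeson was built, so no particular output is claimed here.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value, decode, encode)
import qualified Data.ByteString.Lazy.Char8 as BL8

-- | Decode JSON to a 'Value' and encode it again (hypothetical helper).
-- The object passes through a KeyMap, so the input order is not kept.
roundTrip :: BL8.ByteString -> Maybe BL8.ByteString
roundTrip input = encode <$> (decode input :: Maybe Value)

main :: IO ()
main = do
  -- "port" is written before "name" in the input.
  let input = "{\"port\":8080,\"name\":\"demo\"}"
  -- The output order is determined by the KeyMap implementation,
  -- not by the order of the names in the input.
  maybe (putStrLn "invalid JSON") BL8.putStrLn (roundTrip input)
```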

The encode function encodes data as JSON, and the encodeFile function writes the JSON to a file. These functions create “minimal” JSON, with no unnecessary whitespace. With these functions, it is trivial to output object name-value pairs in a fixed order by using the toEncoding method of the ToJSON type class. This method allows developers to encode data as JSON directly, without going through the Object representation. It exists for increased performance, but it also allows the developer to specify the order of object name-value pairs: simply provide toEncoding implementations in your instances that output name-value pairs in the desired order.
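As a minimal sketch of this approach, the following instance uses pairs to write the name-value pairs in a chosen order. The Server type and its fields are hypothetical and exist only for this example; "port" is deliberately placed before "name" to show that the order is neither alphabetic nor hash-based.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (ToJSON (..), encode, object, pairs, (.=))
import qualified Data.ByteString.Lazy.Char8 as BL8
import Data.Text (Text)

-- | Hypothetical configuration type, used only for illustration.
data Server = Server
  { serverName :: Text
  , serverPort :: Int
  }

instance ToJSON Server where
  -- This goes through 'Value', so the order written here is not preserved.
  toJSON (Server name port) = object ["name" .= name, "port" .= port]

  -- This is written directly, so pairs are emitted in the order given.
  toEncoding (Server name port) = pairs ("port" .= port <> "name" .= name)

main :: IO ()
main =
  -- 'encode' uses 'toEncoding', so this prints {"port":8080,"name":"demo"}
  BL8.putStrLn (encode (Server "demo" 8080))
```

Note that toJSON is still defined, because other code (including aeson-pretty) converts to Value and does not use toEncoding.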

The aeson-pretty library provides a way to encode data as JSON that uses whitespace (indentation and newlines) to make the JSON easier for humans to read. This library works by first converting the data to the Value representation and then traversing the Value, encoding it according to the configuration. Objects are represented using Object, so name-value pairs are ordered according to the KeyMap used, as described above. The Config type has a confCompare field, however, which holds a comparison function that is used to sort the name-value pairs. The keyOrder function creates such a comparison function from a list of names in the desired order.
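A minimal sketch of this configuration follows. The example value and the names prettyConfig and render are assumptions made for illustration; the aeson-pretty functions used are encodePretty', defConfig, and keyOrder.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value, object, (.=))
import Data.Aeson.Encode.Pretty (Config (..), defConfig, encodePretty', keyOrder)
import qualified Data.ByteString.Lazy.Char8 as BL8

-- | Pretty-printing configuration that puts "port" before "name".
-- Names that are not in the list are sorted after the listed ones.
prettyConfig :: Config
prettyConfig = defConfig { confCompare = keyOrder ["port", "name"] }

-- | Hypothetical example value.
example :: Value
example = object ["name" .= ("demo" :: String), "port" .= (8080 :: Int)]

-- | Encode a value using the configured order.
render :: Value -> BL8.ByteString
render = encodePretty' prettyConfig

main :: IO ()
main = BL8.putStrLn (render example)
```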

Using confCompare is trivial when you just need to order the name-value pairs of a single object. It gets more challenging when you need to order the name-value pairs of a very large/nested data structure, particularly when different objects use some of the same names. The confCompare function has to be able to sort all of the objects in the large/nested data structure, so the list of names passed to keyOrder must be compatible with all of them. Note that it is not possible to order some name a before b with one data type and b before a with another data type; the order of the names must be consistent across the whole data structure.

A general solution is to define the order for each object separately and combine them into a directed graph. When the resulting graph is acyclic, any topological ordering is a valid order to use with keyOrder. Note that this calculation can be done at build time using template-haskell.
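One way to sketch this calculation is with the Data.Graph module of the containers package. The combineOrders function below is a hypothetical helper, not code from any library: each per-object order contributes an edge from every name to the name that follows it, and stronglyConnComp is used both to detect cycles and to produce a topological order. This sketch runs at run time; doing the calculation at build time is not shown.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Graph (SCC (..), stronglyConnComp)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Text (Text)

-- | Combine per-object name orders into one global order (hypothetical
-- helper).  Returns 'Left' with the names of a cycle when the orders
-- conflict, or 'Right' with an order that is compatible with all of them.
combineOrders :: [[Text]] -> Either [Text] [Text]
combineOrders orders =
    -- 'stronglyConnComp' returns components in reverse topological
    -- order, so the result is reversed to put earlier names first.
    traverse acyclic . reverse $ stronglyConnComp
      [ (k, k, Set.toList after) | (k, after) <- Map.toList graph ]
  where
    -- Map from each name to the names that directly follow it.
    graph :: Map.Map Text (Set.Set Text)
    graph = Map.fromListWith Set.union $
      [ (k, Set.empty) | ks <- orders, k <- ks ] ++
      [ (a, Set.singleton b) | ks <- orders, (a, b) <- zip ks (drop 1 ks) ]

    acyclic :: SCC Text -> Either [Text] Text
    acyclic (AcyclicSCC k) = Right k
    acyclic (CyclicSCC ks) = Left ks

main :: IO ()
main = do
  -- Two objects that share some names, with compatible orders.
  print $ combineOrders
    [ ["name", "version", "dependencies"]
    , ["name", "dependencies", "flags"]
    ]
  -- Conflicting orders produce a cycle, reported with 'Left'.
  print $ combineOrders [["a", "b"], ["b", "a"]]
```

The resulting list can be passed to keyOrder. When the graph allows more than one topological order, which one is returned is an implementation detail of this sketch.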

The keyOrder function is implemented by indexing the list using elemIndex, which is O(n). Since this is used for comparison in a sort algorithm, I wonder if it might be worthwhile to use a HashMap instead. I have not benchmarked this yet, though.
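A minimal sketch of such an alternative is shown below. The name keyOrderHM is hypothetical, and the sketch is intended to mirror the behavior of keyOrder (names that are not listed sort last) rather than to be a drop-in replacement that has been verified or benchmarked.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.HashMap.Strict as HM
import Data.List (sortBy)
import Data.Maybe (fromMaybe)
import Data.Ord (comparing)
import Data.Text (Text)

-- | Comparison function like 'keyOrder', but using a 'HashMap' lookup
-- instead of 'elemIndex' (hypothetical, not benchmarked).
keyOrderHM :: [Text] -> Text -> Text -> Ordering
keyOrderHM ks = comparing position
  where
    -- Built once per partially applied list of names.  When a name is
    -- listed more than once, the first index is kept.
    indices :: HM.HashMap Text Int
    indices = HM.fromListWith min (zip ks [0 ..])

    -- Names that are not listed are sorted after all listed names.
    position :: Text -> Int
    position k = fromMaybe maxBound (HM.lookup k indices)

main :: IO ()
main =
  -- Prints ["name","version","extra"]
  print (sortBy (keyOrderHM ["name", "version"]) ["extra", "version", "name"])
```

Since the type matches that of keyOrder, the result could be used as the confCompare value in the same way.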

The yaml library provides the same sort of configured comparison in the Data.Yaml.Pretty module. The aeson-pretty keyOrder function (and the same ordered list of names) may be used with this library.
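As a minimal sketch, the following uses setConfCompare from Data.Yaml.Pretty together with keyOrder from aeson-pretty. The example value and the renderYaml name are assumptions made for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value, object, (.=))
import Data.Aeson.Encode.Pretty (keyOrder)
import qualified Data.ByteString.Char8 as BS8
import qualified Data.Yaml.Pretty as YP

-- | Encode a value as YAML with "port" before "name".
renderYaml :: Value -> BS8.ByteString
renderYaml =
  YP.encodePretty (YP.setConfCompare (keyOrder ["port", "name"]) YP.defConfig)

-- | Hypothetical example value.
example :: Value
example = object ["name" .= ("demo" :: String), "port" .= (8080 :: Int)]

main :: IO ()
main = BS8.putStr (renderYaml example)
```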

I wrote a bit about this topic in Aeson Object Design (and Part 2). The techniques described in this blog entry can be used to order object name-value pairs when encoding JSON/YAML, as long as the order of names across the whole data structure is consistent. I still do not know of a way to decode JSON/YAML and maintain the order, however. This is sometimes required when processing arbitrary JSON/YAML or working with poorly designed JSON/YAML structures that “use values as keys.”

I am currently working on a project where I need to order object name-value pairs in output JSON/YAML. When the project is released, I will write a follow-up to this blog entry with discussion about how the code is organized to implement this, including links to the code.

Author

Travis Cardwell
