Pandoc Filter Options
Pandoc converts documents from an input format to an output format via an abstract syntax tree (AST). It provides multiple ways to transform the AST using external programs, called “filters.” Lua filters are Lua programs that are interpreted within the Pandoc process, while JSON filters are arbitrary programs that are executed in a separate process with the AST communicated via standard I/O using JSON serialization. This blog entry describes a way to support filter options when implementing JSON filters in Haskell.
Pandoc provides ToJSONFilter, which takes care of JSON serialization and walking the AST, applying a given transformation function. The interface is quite versatile. Transformation functions can transform values of the following AST types purely (a -> a) or impurely (a -> IO a):
The current documentation states that the Meta and MetaValue types are also supported, but I think that this is a mistake in the documentation. I submitted a pull request with a documentation fix, so I will soon find out whether I am the one who is mistaken.
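As an illustration of a pure transformation function, here is a minimal sketch of a complete filter. It assumes a pandoc-types version in which Str holds Text (1.20 or later), and the function name emphasizeTodo is hypothetical, chosen only for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Text.Pandoc.JSON (Inline (..), toJSONFilter)

-- | Pure transformation (a -> a): render the literal word "TODO" in bold.
-- Every other inline element is returned unchanged.
-- (Hypothetical example function, not part of Pandoc.)
emphasizeTodo :: Inline -> Inline
emphasizeTodo (Str "TODO") = Strong [Str "TODO"]
emphasizeTodo inline = inline

-- toJSONFilter reads the JSON AST from standard input, applies the
-- function to every Inline in the document, and writes the result
-- to standard output.
main :: IO ()
main = toJSONFilter emphasizeTodo
```

An impure variant would have the type Inline -> IO Inline and could be passed to toJSONFilter in the same way.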
Alternatively, transformation functions can transform values to a list of values (a -> [a] or a -> IO [a]). For example, an Inline Str may be parsed into more than one Inline value.
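To make the list-returning form concrete, the following sketch splits a Str at each slash into several Inline values. It assumes pandoc-types 1.20 or later (Text-based Str); the function splitSlashes is a hypothetical example and not something the library provides.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.List (intersperse)
import qualified Data.Text as T
import Text.Pandoc.JSON (Inline (..), toJSONFilter)

-- | List transformation (a -> [a]): split a 'Str' that contains slashes
-- into its parts, keeping each slash as its own 'Str' value.
-- Empty parts (from leading or trailing slashes) are dropped.
-- (Hypothetical example function.)
splitSlashes :: Inline -> [Inline]
splitSlashes (Str t)
  | "/" `T.isInfixOf` t =
      filter (/= Str "") (intersperse (Str "/") (map Str (T.splitOn "/" t)))
splitSlashes inline = [inline]

-- Each Inline in the document is replaced by the list that the
-- function returns for it.
main :: IO ()
main = toJSONFilter splitSlashes
```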
JSON filters can be used in two ways. When used explicitly, one can use the pandoc command to output the JSON AST and pipe it to the filter program. The filter program outputs the transformed JSON AST, which can be piped to a separate pandoc command that reads the AST and outputs the document in the target format. Alternatively, a single call to pandoc can use a --filter (or -F) option to specify the filter program. In this case, Pandoc handles the filter program execution and JSON AST pipes for you.
When using the --filter option, Pandoc passes the target format name as an argument to the filter program. ToJSONFilter supports transformation functions that have a Maybe Format argument for using this information. When the --filter option is used, a value is always passed. When running the filter program explicitly, the user might not pass a value, in which case Nothing is passed to the transformation function.
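A transformation function that uses the target format might look like the sketch below. It assumes pandoc-types 1.20 or later and that Pandoc passes the format name in lower case (for example latex); the function texLogo is hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Text.Pandoc.JSON (Format (..), Inline (..), toJSONFilter)

-- | Replace the word "LaTeX" with the raw LaTeX logo command, but only
-- when the target format is known to be "latex". With any other format,
-- or with no format argument at all (Nothing), nothing is changed.
-- (Hypothetical example function.)
texLogo :: Maybe Format -> Inline -> Inline
texLogo (Just (Format "latex")) (Str "LaTeX") =
  RawInline (Format "latex") "\\LaTeX{}"
texLogo _ inline = inline

-- toJSONFilter takes the first command-line argument, if any, and
-- passes it to the function as the Maybe Format value.
main :: IO ()
main = toJSONFilter texLogo
```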
When using Maybe Format, any additional arguments are ignored, but ToJSONFilter also supports transformation functions that have a [String] argument. All command-line arguments are passed to the transformation function in this case.
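The [String] form can be sketched as follows. The function highlightWords is a hypothetical example that treats every command-line argument as a word to emphasize; it again assumes pandoc-types 1.20 or later.

```haskell
import qualified Data.Text as T
import Text.Pandoc.JSON (Inline (..), toJSONFilter)

-- | Emphasize every word that appears among the command-line arguments.
-- Note that when the filter is run via --filter, the only argument is
-- the target format name, so this form is mostly useful when the
-- filter program is run explicitly.
-- (Hypothetical example function.)
highlightWords :: [String] -> Inline -> Inline
highlightWords args inline@(Str t)
  | T.unpack t `elem` args = Emph [inline]
highlightWords _ inline = inline

-- toJSONFilter passes all command-line arguments as the [String] value.
main :: IO ()
main = toJSONFilter highlightWords
```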
This functionality is nice to have when quickly writing a filter program, but it does not provide a good user interface. I prefer to parse options outside of the transformation function, and I also prefer to write programs that have --help documentation.
Thankfully, this is easy to do: handle the arguments first, and then call toJSONFilter with the parsed options partially applied to the transformation function.
For example, a transformation function may have the following type:
foo :: SomeOption -> AnotherOption -> Maybe Format -> Inline -> Inline
The Main module can implement a full command-line interface, using optparse-applicative for example. The toJSONFilter function is used after the arguments have been parsed.
data Options
  = Options
    { someOption     :: !SomeOption
    , anotherOption  :: !AnotherOption
    , formatArgument :: !(Maybe String)
    }

getOptions :: IO Options
getOptions = ...

main :: IO ()
main = do
  opts <- getOptions
  toJSONFilter $
    foo
      (someOption opts)
      (anotherOption opts)
      (Format . T.pack <$> formatArgument opts)
If you would like to be able to use options even when using --filter, support can be added for setting options via environment variables. The Main module outline above is non-executable, but I hope to release a project soon that will serve as a concrete example of implementing a Pandoc filter with options that can be configured using command-line arguments as well as environment variables.
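In the meantime, here is one way the pieces could fit together. It is only a sketch under stated assumptions, not the project mentioned above: the options (--prefix, --upper), the environment variable name FILTER_PREFIX, and the function decorate are all hypothetical, and it assumes optparse-applicative and pandoc-types 1.20 or later are available.

```haskell
import qualified Data.Text as T
import qualified Options.Applicative as OA
import System.Environment (lookupEnv)
import Text.Pandoc.JSON (Format (..), Inline (..), toJSONFilter)

-- | Hypothetical options for this example.
data Options = Options
  { optPrefix :: !T.Text
  , optUpper  :: !Bool
  , optFormat :: !(Maybe String)
  }
  deriving (Eq, Show)

-- | Command-line parser. The default prefix is supplied by the caller,
-- so it can come from an environment variable.
optionsParser :: T.Text -> OA.Parser Options
optionsParser defaultPrefix =
  Options
    <$> OA.strOption
          ( OA.long "prefix"
              <> OA.metavar "TEXT"
              <> OA.value defaultPrefix
              <> OA.help "Text to prepend to every word"
          )
    <*> OA.switch (OA.long "upper" <> OA.help "Convert words to upper case")
    <*> OA.optional
          ( OA.strArgument
              (OA.metavar "FORMAT" <> OA.help "Target format, as passed by pandoc")
          )

-- | Read the (hypothetical) FILTER_PREFIX variable, then parse arguments.
-- A --prefix argument overrides the environment variable.
getOptions :: IO Options
getOptions = do
  envPrefix <- maybe T.empty T.pack <$> lookupEnv "FILTER_PREFIX"
  OA.execParser $
    OA.info
      (OA.helper <*> optionsParser envPrefix)
      (OA.fullDesc <> OA.progDesc "Example Pandoc filter with options")

-- | Transformation function with the options as leading arguments.
-- The format is accepted but not used in this small example.
decorate :: T.Text -> Bool -> Maybe Format -> Inline -> Inline
decorate prefix upper _format (Str t) =
  Str (prefix <> (if upper then T.toUpper t else t))
decorate _ _ _ inline = inline

main :: IO ()
main = do
  opts <- getOptions
  -- Partially apply the parsed options; what remains is Inline -> Inline.
  toJSONFilter $
    decorate
      (optPrefix opts)
      (optUpper opts)
      (Format . T.pack <$> optFormat opts)
```

When such a program is run via --filter, the only argument is the format name, so the prefix would have to be set through the environment variable; when it is run explicitly, either mechanism could be used.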