GET Query Encoding

As described in New Project: bm (WIP), bm can be configured to perform queries. Since the software works by opening a URL in a browser, only GET queries are supported. Specifically, GET queries with a single variable parameter are supported. Additional parameters may be specified in the configuration, with constant values.

I noticed some interesting behavior in the initial implementation. Here is a trace of the example in the above blog entry:

$ bm -t ddg haskell LambdaCase
[firefox]
<ddg>
firefox https://duckduckgo.com/?q=haskell%20LambdaCase

When that URL is opened in a new browser tab, the server redirects to the following URL:

https://duckduckgo.com/?q=haskell+LambdaCase&ia=web

The initial implementation uses the network-uri library to encode query parameters. It escapes spaces as %20, but browsers usually encode spaces in query strings as +. DuckDuckGo redirects to a canonical search URL. Some other sites do not, and the browser then displays the %20 as a literal space in the address bar, which looks odd.
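
The initial implementation is not shown in this entry, so the following is only a minimal sketch of the behavior described above, assuming the parameter was escaped with escapeURIString and the isUnreserved predicate. The name escapeStrict is hypothetical.

```haskell
import qualified Network.URI as URI

-- Percent-escape every character that is not an RFC 3986 "unreserved"
-- character.  A space is not unreserved, so it is escaped as %20.
-- (escapeStrict is a hypothetical name used only for this sketch.)
escapeStrict :: String -> String
escapeStrict = URI.escapeURIString URI.isUnreserved
```

With this definition, escapeStrict "haskell LambdaCase" evaluates to "haskell%20LambdaCase", which matches the query string in the trace.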

Searching the network-uri documentation, I was surprised that it does not seem to support encoding spaces as + in query strings! I checked out the http-types library, and it has the same behavior.

I researched the issue and was surprised to learn that %20 is correct according to the URI specification (RFC 3986), while + is a convention that browsers use when submitting forms (the application/x-www-form-urlencoded format). Encoding a space as %20 works in any part of a URL, while encoding it as + only works in the query, so people encourage always using %20. This topic is discussed in several StackOverflow questions.
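
The difference is easiest to see from the decoding side. The following sketch is not part of bm; decodeGeneric and decodeFormQuery are hypothetical names, and only unEscapeString comes from network-uri. It contrasts generic percent-decoding with the form-style decoding that servers typically apply to query strings.

```haskell
import qualified Network.URI as URI

-- Generic URI decoding: only percent-escapes are undone.
-- A '+' is an ordinary character here and is left alone.
decodeGeneric :: String -> String
decodeGeneric = URI.unEscapeString

-- Form-style query decoding: '+' stands for a space.  The '+' characters
-- are replaced before percent-escapes are undone, so %2B still decodes
-- to a literal '+'.
decodeFormQuery :: String -> String
decodeFormQuery = URI.unEscapeString . map plusToSpace
  where
    plusToSpace '+' = ' '
    plusToSpace c   = c
```

Both functions decode "haskell%20LambdaCase" to "haskell LambdaCase", but only decodeFormQuery turns "haskell+LambdaCase" into "haskell LambdaCase". This is why %20 is safe everywhere, while + depends on the receiver treating the text as a form-encoded query.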

The Haskell libraries follow the specifications, so they escape spaces as %20. In this case, however, the URL is opened in a browser, so I would like to use browser conventions. I would also like to avoid the redirect that is described above, if possible.

The network-uri API makes it easy to handle spaces specially. Here is my updated implementation of encodeParameter:

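-- Requires RecordWildCards and: import qualified Network.URI as URI
encodeParameter :: Parameter -> String
encodeParameter Parameter{..} = encodePart name ++ "=" ++ encodePart value
  where
    -- Percent-escape everything except unreserved characters and spaces,
    -- then replace the remaining (literal) spaces with '+'.
    encodePart :: String -> String
    encodePart
      = map (\c -> if c == ' ' then '+' else c)
      . URI.escapeURIString ((||) <$> URI.isUnreserved <*> (== ' '))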
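
To try this function in isolation, it can be wrapped in a small self-contained module. The Parameter type below is a hypothetical stand-in with only the two fields that the function uses; the real type in bm may differ.

```haskell
{-# LANGUAGE RecordWildCards #-}

import qualified Network.URI as URI

-- Hypothetical stand-in for bm's Parameter type (assumption: the real
-- type has at least these two String fields).
data Parameter = Parameter
  { name  :: String
  , value :: String
  }

encodeParameter :: Parameter -> String
encodeParameter Parameter{..} = encodePart name ++ "=" ++ encodePart value
  where
    -- Spaces survive the escaping step and are then turned into '+'.
    -- A literal '+' in the input is not unreserved, so it has already
    -- been escaped as %2B and cannot be confused with a space.
    encodePart :: String -> String
    encodePart
      = map (\c -> if c == ' ' then '+' else c)
      . URI.escapeURIString ((||) <$> URI.isUnreserved <*> (== ' '))
```

For example, encodeParameter (Parameter "q" "haskell LambdaCase") evaluates to "q=haskell+LambdaCase", and encodeParameter (Parameter "q" "c++ & rust") evaluates to "q=c%2B%2B+%26+rust". The order of the two steps matters: replacing spaces before escaping would cause the inserted + characters to be escaped as %2B.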

To add the ia parameter, I updated my bm configuration as follows:

- keyword: ddg
  url: https://duckduckgo.com
  query:
    action: https://duckduckgo.com/
    hidden:
      - name: ia
        value: web
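
How the hidden parameter ends up in the final URL can be sketched as follows. This is an illustration and not bm's actual code: buildQueryUrl is a hypothetical helper, parameters are represented as plain pairs, and the variable parameter is assumed to be named q, as seen in the trace.

```haskell
import Data.List (intercalate)
import qualified Network.URI as URI

-- Same encoding as encodePart above: escape everything except
-- unreserved characters and spaces, then turn spaces into '+'.
encodePart :: String -> String
encodePart
  = map (\c -> if c == ' ' then '+' else c)
  . URI.escapeURIString ((||) <$> URI.isUnreserved <*> (== ' '))

-- Hypothetical helper: append the variable parameter, followed by the
-- constant "hidden" parameters, to the action URL.
buildQueryUrl
  :: String              -- action URL
  -> String              -- name of the variable parameter (assumed "q")
  -> [(String, String)]  -- hidden parameters with constant values
  -> String              -- search text from the command line
  -> String
buildQueryUrl action param hidden arg =
    action ++ "?" ++ intercalate "&" (map encodePair ((param, arg) : hidden))
  where
    encodePair (n, v) = encodePart n ++ "=" ++ encodePart v
```

Under these assumptions, buildQueryUrl "https://duckduckgo.com/" "q" [("ia", "web")] "haskell LambdaCase" evaluates to "https://duckduckgo.com/?q=haskell+LambdaCase&ia=web".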

Here is a trace with this new implementation and configuration:

$ bm -t ddg haskell LambdaCase
[firefox]
<ddg>
firefox https://duckduckgo.com/?q=haskell+LambdaCase&ia=web

There is no more redirect!

Author

Travis Cardwell
