GET Query Encoding
As described in New Project: bm (WIP), bm can be configured to perform queries. Since the software works by opening a URL in a browser, only GET queries are supported. Specifically, GET queries with a single variable parameter are supported. Additional parameters may be specified in the configuration, with constant values.
I noticed some interesting behavior in the initial implementation. Here is a trace of the example in the above blog entry:
$ bm -t ddg haskell LambdaCase
[firefox]
<ddg>
firefox https://duckduckgo.com/?q=haskell%20LambdaCase
When that URL is opened in a new browser tab, the server redirects to the following URL:
https://duckduckgo.com/?q=haskell+LambdaCase&ia=web
The initial implementation uses the network-uri library to encode query parameters. It escapes spaces as %20, but spaces in query strings are usually encoded as + in the browser. DuckDuckGo redirects to a canonical search URL. Some other sites do not, and the %20 is unescaped to a space in the browser address bar, which looks weird.
Searching the network-uri documentation, I was surprised that it does not seem to support encoding spaces as + in query strings! I checked out the http-types library, and it has the same behavior.
I researched the issue and was surprised to learn that %20 is correct according to specifications while + is a convention used by browsers when submitting forms. Encoding as %20 works in any part of a URL while encoding as + only works in the query, so people encourage always using %20. Here are some Stack Overflow questions about this topic:
The Haskell libraries follow the specifications, so they escape spaces as %20. In this case, however, the URL is opened in a browser, so I would like to use browser conventions. I would also like to avoid the redirect that is described above, if possible.
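To make the difference between the two conventions concrete, here is a minimal sketch that uses only base. The names percentEncode and formEncode are hypothetical and are not part of bm, network-uri, or http-types. The form variant is a simplification of what browsers do: it only adds the space-to-plus rule on top of percent-encoding and does not try to match the exact form-encoding algorithm.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (isAlphaNum, isAscii, ord, toUpper)
import Numeric (showHex)

-- | RFC 3986 "unreserved" characters, which never need escaping.
isUnreservedChar :: Char -> Bool
isUnreservedChar c = isAscii c && (isAlphaNum c || c `elem` "-._~")

-- | UTF-8 encode a character as a list of byte values.
utf8Bytes :: Char -> [Int]
utf8Bytes c
  | n < 0x80    = [n]
  | n < 0x800   = [0xC0 .|. (n `shiftR` 6), cont n]
  | n < 0x10000 = [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  | otherwise   =
      [0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12), cont (n `shiftR` 6), cont n]
  where
    n = ord c
    cont x = 0x80 .|. (x .&. 0x3F)

-- | Render one byte as a percent escape such as "%20".
escapeByte :: Int -> String
escapeByte b = '%' : map toUpper (pad (showHex b ""))
  where
    pad [d] = ['0', d]
    pad ds  = ds

-- | Specification-style encoding: a space becomes "%20".
percentEncode :: String -> String
percentEncode = concatMap enc
  where
    enc c
      | isUnreservedChar c = [c]
      | otherwise          = concatMap escapeByte (utf8Bytes c)

-- | Browser form convention (simplified): a space becomes "+".
-- A literal '+' is still escaped as "%2B", so the two stay distinguishable.
formEncode :: String -> String
formEncode = concatMap enc
  where
    enc ' ' = "+"
    enc c   = percentEncode [c]
```

With these definitions, percentEncode "haskell LambdaCase" gives "haskell%20LambdaCase" while formEncode "haskell LambdaCase" gives "haskell+LambdaCase". A search for "C++ templates" shows why the plus sign itself must be escaped in the form variant: formEncode yields "C%2B%2B+templates".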
The network-uri API makes it easy to handle spaces specially. Here is my updated implementation of encodeParameter:
encodeParameter :: Parameter -> String
encodeParameter Parameter{..} = encodePart name ++ "=" ++ encodePart value
  where
    -- Percent-escape everything except unreserved characters and spaces,
    -- then turn the remaining spaces into '+'.
    encodePart :: String -> String
    encodePart
      = map (\c -> if c == ' ' then '+' else c)
      . URI.escapeURIString ((||) <$> URI.isUnreserved <*> (== ' '))
To add the ia parameter, I updated my bm configuration as follows:
- keyword: ddg
  url: https://duckduckgo.com
  query:
    action: https://duckduckgo.com/
    hidden:
      - name: ia
        value: web
Here is a trace with this new implementation and configuration:
$ bm -t ddg haskell LambdaCase
[firefox]
<ddg>
firefox https://duckduckgo.com/?q=haskell+LambdaCase&ia=web
There is no more redirect!
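The URL in this trace can be reproduced outside of bm with a small sketch like the following. It assumes the network-uri package is available. The Parameter record is a stand-in with the two fields that encodeParameter uses, and buildUrl is a hypothetical helper, so neither should be read as the actual bm source.

```haskell
{-# LANGUAGE RecordWildCards #-}

import Data.List (intercalate)
import qualified Network.URI as URI  -- from the network-uri package

-- | Stand-in for the parameter type (assumed shape: a name and a value).
data Parameter = Parameter
  { name  :: String
  , value :: String
  }

-- | Escape everything except unreserved characters and spaces,
-- then replace each remaining space with '+'.
encodePart :: String -> String
encodePart
  = map (\c -> if c == ' ' then '+' else c)
  . URI.escapeURIString (\c -> URI.isUnreserved c || c == ' ')

-- | Encode one parameter as name=value.
encodeParameter :: Parameter -> String
encodeParameter Parameter{..} = encodePart name ++ "=" ++ encodePart value

-- | Hypothetical helper: append the encoded parameters to the action URL.
buildUrl :: String -> [Parameter] -> String
buildUrl action params =
  action ++ "?" ++ intercalate "&" (map encodeParameter params)
```

Passing the action https://duckduckgo.com/ with the variable parameter q set to "haskell LambdaCase" followed by the hidden parameter ia set to "web" produces https://duckduckgo.com/?q=haskell+LambdaCase&ia=web, which matches the trace.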