Overflow While Parsing IPv4 Addresses (Part 2)
In the Overflow
While Parsing IPv4 Addresses blog entry, I mentioned that I
submitted a GitHub issue to
report the overflow bug in the IPv4 address parser in the ip library. This
morning, Andrew Martin mentioned that he would accept a PR. I happened
to be thinking about folds
this morning, and parsing such Text
is usually done with a
fold, so I went ahead and implemented a fix
myself.
The core of the fix is the readOctet
function:
readOctet :: TextRead.Reader Word
= do
readOctet t let (digits, rest) = Text.span Char.isDigit t
= do
go d f n let n' = n * 10 + Char.ord d - 48
$ Char.isDigit d && n' <= 255
guard
f n'$ Left "octet does not start with a digit"
when (Text.null digits) case Text.foldr go Just digits 0 of
Just n -> Right (fromIntegral n, rest)
Nothing -> Left ipOctetSizeErrorMsg
(Note: This code is written to match the style of the ip source.)
Since this function is used in place of the decimal
function, it is designed to behave similarly and accept any number of
leading zeros. Note that it includes an improvement over the DecimalMaxStrict
implementation: the accumulator has an Int
type, allowing
for a single fromIntegral
call instead of one for each
digit.
For anybody who has not seen this style of fold before, it is worth analyzing. The type of foldr is as follows:
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
The go
function is passed as the first argument. It has
the following type:
go :: Char -> (Int -> Maybe Int) -> Int -> Maybe Int
With some additional parenthesis, it is clear that this type matches
the type of the first parameter of foldr
:
go :: Char -> (Int -> Maybe Int) -> (Int -> Maybe Int)
The second argument to foldr
is Just
, which
has type Int -> Maybe Int
, and the third argument is
digits
, a Text
value (since this function uses
Text.foldr
) that folds over Char
items. The
foldr
function returns a function of type
Int -> Maybe Int
, to which the 0
gets
passed.
One of the remarkable properties of this style of fold is that it can terminate without processing all of the input. Though infinite input is not possible in this case, such folds can even be used with infinite input.
To fully understand how it works, you might want to work through the
evaluation of the following exercises using the above definition of
go
:
foldr go Just "1" 0
foldr go Just "111" 0
foldr go Just "1111" 0
foldr go Just (repeat '1') 0