Overflow While Parsing IPv4 Addresses (Part 2)
In the Overflow
While Parsing IPv4 Addresses blog entry, I mentioned that I
submitted a GitHub issue to
report the overflow bug in the IPv4 address parser in the ip library. This
morning, Andrew Martin mentioned that he would accept a PR. I happened
to be thinking about folds
this morning, and parsing such Text is usually done with a
fold, so I went ahead and implemented a fix
myself.
The core of the fix is the readOctet function:
readOctet :: TextRead.Reader Word
readOctet t = do
let (digits, rest) = Text.span Char.isDigit t
go d f n = do
let n' = n * 10 + Char.ord d - 48
guard $ Char.isDigit d && n' <= 255
f n'
when (Text.null digits) $ Left "octet does not start with a digit"
case Text.foldr go Just digits 0 of
Just n -> Right (fromIntegral n, rest)
Nothing -> Left ipOctetSizeErrorMsg(Note: This code is written to match the style of the ip source.)
Since this function is used in place of the decimal
function, it is designed to behave similarly and accept any number of
leading zeros. Note that it includes an improvement over the DecimalMaxStrict
implementation: the accumulator has an Int type, allowing
for a single fromIntegral call instead of one for each
digit.
For anybody who has not seen this style of fold before, it is worth analyzing. The type of foldr is as follows:
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> bThe go function is passed as the first argument. It has
the following type:
go :: Char -> (Int -> Maybe Int) -> Int -> Maybe IntWith some additional parenthesis, it is clear that this type matches
the type of the first parameter of foldr:
go :: Char -> (Int -> Maybe Int) -> (Int -> Maybe Int)The second argument to foldr is Just, which
has type Int -> Maybe Int, and the third argument is
digits, a Text value (since this function uses
Text.foldr) that folds over Char items. The
foldr function returns a function of type
Int -> Maybe Int, to which the 0 gets
passed.
One of the remarkable properties of this style of fold is that it can terminate without processing all of the input. Though infinite input is not possible in this case, such folds can even be used with infinite input.
To fully understand how it works, you might want to work through the
evaluation of the following exercises using the above definition of
go:
foldr go Just "1" 0foldr go Just "111" 0foldr go Just "1111" 0foldr go Just (repeat '1') 0