Skip to main content

Overflow While Parsing IPv4 Addresses (Part 2)

In the Overflow While Parsing IPv4 Addresses blog entry, I mentioned that I submitted a GitHub issue to report the overflow bug in the IPv4 address parser in the ip library. This morning, Andrew Martin mentioned that he would accept a PR. I happened to be thinking about folds this morning, and parsing such Text is usually done with a fold, so I went ahead and implemented a fix myself.

The core of the fix is the readOctet function:

readOctet :: TextRead.Reader Word
readOctet t = do
  let (digits, rest) = Text.span Char.isDigit t
      go d f n = do
        let n' = n * 10 + Char.ord d - 48
        guard $ Char.isDigit d && n' <= 255
        f n'
  when (Text.null digits) $ Left "octet does not start with a digit"
  case Text.foldr go Just digits 0 of
    Just n  -> Right (fromIntegral n, rest)
    Nothing -> Left ipOctetSizeErrorMsg

(Note: This code is written to match the style of the ip source.)

Since this function is used in place of the decimal function, it is designed to behave similarly and accept any number of leading zeros. Note that it includes an improvement over the DecimalMaxStrict implementation: the accumulator has an Int type, allowing for a single fromIntegral call instead of one for each digit.

For anybody who has not seen this style of fold before, it is worth analyzing. The type of foldr is as follows:

foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b

The go function is passed as the first argument. It has the following type:

go :: Char -> (Int -> Maybe Int) -> Int -> Maybe Int

With some additional parenthesis, it is clear that this type matches the type of the first parameter of foldr:

go :: Char -> (Int -> Maybe Int) -> (Int -> Maybe Int)

The second argument to foldr is Just, which has type Int -> Maybe Int, and the third argument is digits, a Text value (since this function uses Text.foldr) that folds over Char items. The foldr function returns a function of type Int -> Maybe Int, to which the 0 gets passed.

One of the remarkable properties of this style of fold is that it can terminate without processing all of the input. Though infinite input is not possible in this case, such folds can even be used with infinite input.

To fully understand how it works, you might want to work through the evaluation of the following exercises using the above definition of go:

  1. foldr go Just "1" 0
  2. foldr go Just "111" 0
  3. foldr go Just "1111" 0
  4. foldr go Just (repeat '1') 0