BasicLambda int does not parse integers
chrisdone opened this issue · 2 comments
The example on master doesn't parse integers:
> :m + FlatParse.Examples.BasicLambda.Parser FlatParse.Basic
> runParser src' "123"
Err (Imprecise (Pos 3) [Msg "identifier",Lit "true",Lit "false",Msg "parenthesized expression",Msg "integer literal",Lit "let",Lit "lam",Lit "if"])
> runParser int "123"
Fail
I think the parser is too greedy:
int :: Parser Int
int = token $
snd <$> chainr (\n (!place, !acc) -> (place*10,acc+place*n)) digit ((10,) <$> digit)
chainr :: (a -> b -> b) -> Parser e a -> Parser e b -> Parser e b
chainr f (Parser elem) (Parser end) = go where
go = Parser \fp eob s -> case elem fp eob s of
OK# a s -> case runParser# go fp eob s of
OK# b s -> let !b' = f a b in OK# b' s
x -> x
Fail# -> end fp eob s
Err# e -> Err# e
{-# inline chainr #-}
chainr
consumes elem
(digit in this case) greedily until it Fails and then it tries to consume end
, which of course fails because there are no more integers to parse. So either you have to make the end
a no-op and use a lookahead to ensure there is at least one digit to consume, or perhaps pick another combinator over chainr
. Maybe you could have chainr1
that would omit an end parser.
Gross but functional example:
int :: F.Parser ParseError Int
int = do
_ <- F.lookahead digit
i <- snd <$>
(F.chainr
(\n (!place, !acc) -> (place * 10, acc + place * n))
digit
(pure (1,0)))
ws
pure i
> F.runParser int ""
Fail
> F.runParser int "00012"
OK 12 ""
You're right. I'm inclined to go with the following now:
int :: Parser Int
int = token do
(place, n) <- chainr (\n (!place, !acc) -> (place*10,acc+place*n)) digit (pure (1, 0))
case place of
1 -> empty
_ -> pure n
My general idea in the API is that list-returning functions should be de-emphasized in favor of fusable/inlinable combinators. I'm not exactly sure about the right set of such combinators. The above int
solution still looks kind of stiff.
To digress a bit, every solution we can do now has awful performance because of the Int
boxing, and ideally we'd have a levity polymorphic API for at least some basic methods and combinators. I've made an issue about this: #5
I prefer your version indeed. It measures the same performance as mine, so I'll use yours.
(No change to these numbers)
benchmarking parseText/Parser2/Text/array[1000]
time 43.24 μs (43.02 μs .. 43.48 μs)
0.997 R² (0.993 R² .. 1.000 R²)
mean 44.10 μs (43.07 μs .. 46.31 μs)
std dev 5.031 μs (2.197 μs .. 9.385 μs)
variance introduced by outliers: 87% (severely inflated)
benchmarking parseText/Parser2/Text/array[2000]
time 103.5 μs (103.2 μs .. 103.9 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 103.8 μs (103.5 μs .. 104.4 μs)
std dev 1.493 μs (1.250 μs .. 1.896 μs)
benchmarking parseText/Parser2/Text/array[4000]
time 262.6 μs (261.5 μs .. 264.1 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 262.7 μs (261.7 μs .. 263.8 μs)
std dev 3.554 μs (2.634 μs .. 4.579 μs)
My general idea in the API is that list-returning functions should be de-emphasized in favor of fusable/inlinable combinators. I'm not exactly sure about the right set of such combinators. The above int solution still looks kind of stiff.
This makes sense; I also don't know a good API for e.g. parsing into a vector.
Thanks!