Allow polymorphic state in Stateful parser
dminuoso opened this issue · 4 comments
Right now, Stateful parsers have a monomorphic Int# state. While that is enough to carry basic parser state, such as which parser mode we are currently in, it is not enough when one wants to collect information such as a map of known identifiers, or to construct a result during parsing. In our case this means we have to parse into an intermediate language and interpret that instead, whereas with megaparsec we would simply build the tree during parsing.
I've done a simple copy of the Stateful parser, making it polymorphic over state.
https://github.com/dminuoso/flatparse/tree/polymorphic-state
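To sketch the difference (hypothetical signatures for illustration only; the branch's actual spelling may differ):

```haskell
-- FlatParse.Stateful today (roughly): the threaded state is hard-wired to Int.
get    :: Parser r e Int
put    :: Int -> Parser r e ()
modify :: (Int -> Int) -> Parser r e ()

-- With a polymorphic state (hypothetical `ParserS s r e a`, not the branch's
-- actual names): the state can be e.g. a map of known identifiers, or the
-- result tree being built up during parsing.
get    :: ParserS s r e s
put    :: s -> ParserS s r e ()
modify :: (s -> s) -> ParserS s r e ()
```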
Benchmarks suggest that, strangely enough, it is consistently faster than the Int#-monomorphized state in the sexp and basic benchmarks, and slightly slower in the numeral benchmark. The first two cases probably warrant some analysis of the Core.
benchmarked sexp/fpstateful
time 3.156 ms (3.150 ms .. 3.167 ms)
1.000 R² (1.000 R² .. 1.000 R²)
mean 3.174 ms (3.169 ms .. 3.180 ms)
std dev 16.21 μs (13.06 μs .. 20.49 μs)
benchmarked sexp/fppolystateful
time 2.932 ms (2.899 ms .. 2.966 ms)
0.999 R² (0.999 R² .. 1.000 R²)
mean 2.965 ms (2.951 ms .. 2.981 ms)
std dev 49.66 μs (41.52 μs .. 60.77 μs)
benchmarked long keyword/fpstateful
time 237.6 μs (227.6 μs .. 246.8 μs)
0.994 R² (0.991 R² .. 0.999 R²)
mean 230.7 μs (228.9 μs .. 233.5 μs)
std dev 7.238 μs (4.717 μs .. 10.35 μs)
variance introduced by outliers: 13% (moderately inflated)
benchmarked long keyword/fppolystateful
time 226.9 μs (225.4 μs .. 229.8 μs)
0.999 R² (0.998 R² .. 1.000 R²)
mean 226.8 μs (226.1 μs .. 228.1 μs)
std dev 2.950 μs (1.461 μs .. 4.728 μs)
benchmarked numeral csv/fpstateful
time 966.9 μs (954.9 μs .. 978.4 μs)
0.999 R² (0.998 R² .. 0.999 R²)
mean 975.5 μs (968.9 μs .. 980.9 μs)
std dev 20.86 μs (18.56 μs .. 23.16 μs)
benchmarked numeral csv/fppolystateful
time 1.035 ms (1.014 ms .. 1.056 ms)
0.998 R² (0.996 R² .. 0.999 R²)
mean 986.1 μs (980.2 μs .. 994.4 μs)
std dev 24.08 μs (17.61 μs .. 34.36 μs)
What are your thoughts?
P.S. I am surprised that sexp/fpbasic takes almost 30% more time than the stateful variants. Any insights here?
Thanks, I'll check out your branch. The performance looks pretty close (perhaps within margin of error), but IIRC the benchmarks don't actually use the threaded state. We should look at some benchmarks where the state is used non-trivially.
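For example, something along these lines would already exercise it (a rough sketch; `countingAtom` is a made-up helper using the current Int state via FlatParse.Stateful's modify):

```haskell
import qualified FlatParse.Stateful as FP

-- Bump the threaded Int state on every successful run of `p`, e.g. to
-- count atoms while parsing the sexp benchmark input.
countingAtom :: FP.Parser r e a -> FP.Parser r e a
countingAtom p = do
  a <- p
  FP.modify (+ 1)
  pure a
```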
With GHC < 9.4, the polymorphic state would be significantly worse than the monomorphic one, because those GHC versions are not able to do nested unboxing in return types, so an Int state would generally remain boxed. It would be worth confirming, though, that the unboxing actually works under GHC 9.4.
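Roughly the shape involved (an illustration only, not flatparse's actual internal types):

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (Addr#, Int#)

-- Monomorphic state: the step result carries an Int# that is unboxed by
-- construction.
type StepMono   = Addr# -> Int# -> (# Addr#, Int# #)

-- Polymorphic state instantiated to s ~ Int: the Int returned inside the
-- unboxed tuple is what GHC < 9.4 fails to unbox, costing an allocation
-- per step.
type StepPoly s = Addr# -> s -> (# Addr#, s #)
```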
I also note that, for collecting identifiers or information related to the current scope, it is generally not necessary to use the state; the reader parameter is sufficient for that. In short, if we only have lexically nested scopes (which covers the great majority of programming languages), then scoping information can be stored in the reader. The reader also has the advantage that it can be unboxed reliably under any reasonable GHC version.
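Something like this, for instance (a minimal sketch against FlatParse.Stateful's ask and local; Scope, withBound and inScope are made-up names):

```haskell
import qualified Data.Set           as S
import qualified FlatParse.Stateful as FP

type Scope = S.Set String

-- Run a sub-parser with one more identifier in scope. The binding goes out
-- of scope automatically when the sub-parser returns, matching lexical
-- nesting.
withBound :: String -> FP.Parser Scope e a -> FP.Parser Scope e a
withBound x = FP.local (S.insert x)

-- Check whether an identifier is visible in the current scope.
inScope :: String -> FP.Parser Scope e Bool
inScope x = S.member x <$> FP.ask
```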
We have a simple solution, then: make it levity-polymorphic over the state, and provide SPECIALIZE pragmas for s ~ Int# on every binding that is not inlined. Best of both worlds?
I've pushed a commit to the repo to reflect this idea.
Hah, it turns out there's apparently a GHC bug: SPECIALIZE does not seem to work with levity-polymorphic things. But the number of non-inlined bindings is small as it is.
Disregard all of that.
PolyKinds does not kick in the way I thought it would: it turns out you cannot have levity-polymorphic binders, which makes a bunch of the relevant functions impossible to implement.
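For the record, this is the usual representation-polymorphism restriction in GHC, nothing specific to flatparse (illustrative snippet, not code from the branch):

```haskell
{-# LANGUAGE DataKinds, PolyKinds, RankNTypes, KindSignatures #-}
import Data.Kind (Type)
import GHC.Exts  (TYPE, RuntimeRep)

-- Accepted: the representation-polymorphic type occurs only in result
-- position, as in the standard type of ($).
apply :: forall (r :: RuntimeRep) (a :: Type) (b :: TYPE r). (a -> b) -> a -> b
apply f x = f x

-- Rejected if uncommented: the binder `x` would have a representation-
-- polymorphic type, so GHC cannot decide how to pass or store it. A
-- state-polymorphic `put :: s -> Parser ...` hits exactly this wall once
-- `s` ranges over arbitrary runtime representations.
--
-- levityId :: forall (r :: RuntimeRep) (a :: TYPE r). a -> a
-- levityId x = x
```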