lezer-parser/lezer

Consider alternative approaches to forced reduction

marijnh opened this issue · 1 comments

Storing a single reduction per state in the parser tables is problematic, because a given state might contain inconsistent rules and rule positions, and blindly reducing any of them may not be valid in a given parse that is in the state. Patch lezer-parser/lr@e931d93 adds a check to avoid corruptions from this, but it might still (in circumstances that are probably quite obscure) produce weirdly structure output trees.

A possible alternative strategy would involve scanning up the stack for a state that has goto entries, and then use one of those to create a reduction. But there is is also tricky to make sure the reduced term actually corresponds to what is being reduced (since the parse tables don't provide us with that information).

A more expensive approach would be to store a set of terms with each state, and use that to pick the right goto entry when scanning up the stack. But again, if applied in the simplest way that doesn't seem to provide 100% guarantee against picking a nonsense reduction, since states don't uniquely determine how they were reached (and might even refer to several positions in the same rule).

In any case this will likely involve a breaking change to the on-disk format, so let's look into it when considering a new breaking release.