softdevteam/grmtools

Nondeterministic generation of Rust code

Closed this issue · 2 comments

I noticed that grmtools does not generate the same token IDs deterministically when run repeatedly on the same specification: https://github.com/FranklinChen/grmtools-nondeterministic-bug

I don't know whether to call this a bug, per se, but if token IDs change between runs, then some tests I would like to write for expected parse failures will need a mapping from token IDs back to a stable string representation.
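A minimal sketch of the kind of mapping described above, in plain Rust with no grmtools dependency. The `TokenId` alias, `name_table`, and `stable_names` are hypothetical names made up for illustration, and the (ID, name) pairs are assumed to come from whatever the generated parser exposes. The idea is only that tests compare token names rather than raw IDs.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a generated token ID; not a grmtools type.
type TokenId = u32;

/// Build an ID -> name table from (id, name) pairs.
fn name_table(pairs: &[(TokenId, &str)]) -> HashMap<TokenId, String> {
    pairs
        .iter()
        .map(|(id, name)| (*id, name.to_string()))
        .collect()
}

/// Translate a sequence of token IDs into names, so that assertions do not
/// depend on how IDs happened to be assigned in a particular run.
fn stable_names(ids: &[TokenId], table: &HashMap<TokenId, String>) -> Vec<String> {
    ids.iter()
        .map(|id| {
            table
                .get(id)
                .cloned()
                .unwrap_or_else(|| format!("<unknown {}>", id))
        })
        .collect()
}

fn main() {
    // Two imagined runs that assign different IDs to the same token names.
    let run_a = name_table(&[(0, "NUM"), (1, "UNKNOWN_TOKEN")]);
    let run_b = name_table(&[(0, "UNKNOWN_TOKEN"), (1, "NUM")]);

    // The raw IDs differ between the runs, but the names agree.
    assert_eq!(
        stable_names(&[0, 1], &run_a),
        stable_names(&[1, 0], &run_b)
    );
}
```

With a table like this, an expected-failure test can assert on names such as `NUM` even if the underlying IDs change.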

ltratt commented

At the moment I'm fairly sure the token IDs are deterministic, though that's not guaranteed and not really intended (you can see the relevant code here).

I think what you're seeing in your example is something different: error recovery is nondeterministically doing Insert NUM or Insert UNKNOWN_TOKEN. That nondeterminism is inevitable (see Section 5.3).

I wonder if with UNKNOWN_TOKEN you intended to turn lexing errors into parsing errors?
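If recovery may pick either Insert NUM or Insert UNKNOWN_TOKEN, one way to keep an expected-failure test stable is to accept any repair from a known set instead of a single fixed one. The sketch below is plain Rust under that assumption. The `Repair` enum and `is_acceptable` are hypothetical simplifications written for this example and are not the grmtools types.

```rust
// Hypothetical, simplified repair type for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Repair {
    Insert(String), // insert a token with this name
    Delete,         // delete the offending token
}

/// Return true if the reported repair sequence matches any sequence the
/// test author considers acceptable.
fn is_acceptable(reported: &[Repair], acceptable: &[Vec<Repair>]) -> bool {
    acceptable.iter().any(|seq| seq.as_slice() == reported)
}

fn main() {
    // Either insertion is treated as a valid outcome of recovery.
    let acceptable = vec![
        vec![Repair::Insert("NUM".to_string())],
        vec![Repair::Insert("UNKNOWN_TOKEN".to_string())],
    ];

    // Whichever repair a given run reports, the test passes.
    assert!(is_acceptable(&[Repair::Insert("NUM".to_string())], &acceptable));
    assert!(is_acceptable(
        &[Repair::Insert("UNKNOWN_TOKEN".to_string())],
        &acceptable
    ));

    // A repair outside the set still fails the check.
    assert!(!is_acceptable(&[Repair::Delete], &acceptable));
}
```

The test then asserts that an error was reported and that the repair belongs to the accepted set, rather than depending on which repair a particular run happened to choose.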

Oh, I see, the recovery is nondeterministic. And yes, I wanted the lexing error to turn into a parsing error. I'll close this and change my code.