Nondeterministic generation of Rust code

Question

Closed this issue a year ago · 2 comments

I noticed that grmtools does not seem to generate the same token IDs deterministically when run repeatedly on the same specification: https://github.com/FranklinChen/grmtools-nondeterministic-bug

I don't know whether to call this a bug per se, but if token IDs can change between runs, then some tests I would like to write for expected parse failures will need a mapping from token IDs back to a stable string representation.

Answer 1 · 2023-06-05T19:31:22.000Z

At the moment I'm fairly sure the token IDs are deterministic, though that's not guaranteed and not really intended (you can see the relevant code here).

I think what you're seeing in your example is something different: error recovery is nondeterministically doing Insert NUM or Insert UNKNOWN_TOKEN. That nondeterminism is inevitable (see Section 5.3).
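One way to keep a test stable despite this is to assert that the reported repair is one of several acceptable alternatives, compared by token name rather than by token ID. The following is only a rough sketch: the `Repair` enum and `is_acceptable` function are hypothetical stand-ins written for illustration, not lrpar's actual types or API.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a parser's repair type (an assumption for
/// illustration; this is not lrpar's real repair type). Tokens are
/// identified by name so the test does not depend on numeric token IDs.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Repair {
    Insert(String),
    Delete,
}

/// Returns true if the observed repair sequence is one of the sequences
/// the test is willing to accept.
fn is_acceptable(observed: &[Repair], acceptable: &HashSet<Vec<Repair>>) -> bool {
    acceptable.contains(observed)
}

fn main() {
    // Either insertion may be reported, so the test accepts both.
    let acceptable: HashSet<Vec<Repair>> = HashSet::from([
        vec![Repair::Insert("NUM".to_string())],
        vec![Repair::Insert("UNKNOWN_TOKEN".to_string())],
    ]);

    // Whichever of the two the parser happened to pick, the check passes.
    let run_a = vec![Repair::Insert("NUM".to_string())];
    let run_b = vec![Repair::Insert("UNKNOWN_TOKEN".to_string())];
    assert!(is_acceptable(&run_a, &acceptable));
    assert!(is_acceptable(&run_b, &acceptable));

    // A repair outside the accepted set still fails the check.
    let unexpected = vec![Repair::Delete];
    assert!(!is_acceptable(&unexpected, &acceptable));
}
```

In a real test, the observed repairs would first have to be translated from whatever the parser reports into names like these; how to do that depends on the grmtools version in use.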

I wonder if with UNKNOWN_TOKEN you intended to turn lexing errors into parsing errors?

Answer 2 · 2023-06-05T19:34:21.000Z

Oh, I see, the recovery is nondeterministic. And yes, I wanted the lexing error to turn into a parsing error. I'll close this and change my code.