Your little float-literal bug
I think the problem is that all three of the bits that precede ['e' 'E'] (the sign, the integer digits, and the fractional part) are optional, so a bare exponent such as "E1" matches as a float. The typical way to deal with this problem is to force at least one of the second or third bits, the integer digits or the fractional digits, to be non-empty. I don't have an example of this to hand, but you can probably find one in the OCaml compiler sources, and I'm sure there are plenty of examples in lex specifications elsewhere.
let float_literal =
['+' '-']? ['0'-'9']*
('.' ['0'-'9']* )?
(['e' 'E'] ['+' '-']? ['0'-'9'] ['0'-'9']*)?
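To make the failure concrete, here is a minimal sketch in plain OCaml (not ocamllex) of two hand-written recognisers: one mirroring the rule above, and one applying the "force one part to be non-empty" idea. Every function name is hypothetical and exists only for this illustration; it is one possible reading of the suggested repair, not the project's actual fix.

```ocaml
(* Illustrative only: every function name here is hypothetical. *)

let is_digit c = c >= '0' && c <= '9'

(* Advance past zero or more digits starting at index i. *)
let rec skip_digits s i =
  if i < String.length s && is_digit s.[i] then skip_digits s (i + 1) else i

(* Advance past an optional sign at index i. *)
let skip_sign s i =
  if i < String.length s && (s.[i] = '+' || s.[i] = '-') then i + 1 else i

(* Advance past an optional exponent: e or E, optional sign, one or more
   digits. Returns i unchanged if no well-formed exponent starts there. *)
let skip_exp s i =
  if i < String.length s && (s.[i] = 'e' || s.[i] = 'E') then begin
    let j = skip_sign s (i + 1) in
    let k = skip_digits s j in
    if k > j then k else i
  end else i

(* The original rule: optional sign, optional digits, optional fraction,
   optional exponent. Nothing before the exponent is required. *)
let matches_original s =
  let n = String.length s in
  let i = skip_digits s (skip_sign s 0) in
  let i = if i < n && s.[i] = '.' then skip_digits s (i + 1) else i in
  skip_exp s i = n

(* One possible repair: same shape, but at least one digit must appear
   either before the dot or after it. *)
let matches_forced s =
  let n = String.length s in
  let start = skip_sign s 0 in
  let int_end = skip_digits s start in
  let has_dot = int_end < n && s.[int_end] = '.' in
  let frac_end = if has_dot then skip_digits s (int_end + 1) else int_end in
  let int_len = int_end - start in
  let frac_len = if has_dot then frac_end - int_end - 1 else 0 in
  int_len + frac_len > 0 && skip_exp s frac_end = n

(* Print how each rule treats a few sample inputs. *)
let () =
  List.iter
    (fun s ->
      Printf.printf "%S: original=%b forced=%b\n" s
        (matches_original s) (matches_forced s))
    [ "E1"; ""; "."; ".5"; "1."; "1.5e3" ]
```

Under the original rule "E1", the empty string, and a lone "." are all accepted; the repaired version rejects those three while still accepting ".5", "1." and "1.5e3".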
I'm sure you've seen this @mransan, but https://developers.google.com/protocol-buffers/docs/reference/proto3-spec contains the EBNF for proto files. You can do a pretty mechanical translation from the BNF notation to lex and yacc rules. For example, here is the part for float literals:
decimals = decimalDigit { decimalDigit }
exponent = ( "e" | "E" ) [ "+" | "-" ] decimals
floatLit = ( decimals "." [ decimals ] [ exponent ] | decimals exponent | "."decimals [ exponent ] ) | "inf" | "nan"
This translates almost verbatim to
let decimals = ['0' - '9']+
let exp = ( 'e' | 'E' ) ( '+' | '-' )? decimals
let float_literal = decimals '.' decimals? exp? | decimals exp | '.' decimals exp? | "inf" | "nan"
Which doesn't match "E1" and IMO is easier to read. For a more surgical approach, I think you can get away with changing * to + on your first digit match.
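The two options are not equivalent, so it may help to compare them on a few inputs. The following is a rough sketch in plain OCaml rather than ocamllex: the recognisers are hand-written approximations of the spec-derived rule and of the * to + change, and all function names are hypothetical.

```ocaml
(* Illustrative only: every function name here is hypothetical. *)

let is_digit c = c >= '0' && c <= '9'

(* Advance past zero or more digits starting at index i. *)
let rec skip_digits s i =
  if i < String.length s && is_digit s.[i] then skip_digits s (i + 1) else i

(* Advance past an optional sign at index i. *)
let skip_sign s i =
  if i < String.length s && (s.[i] = '+' || s.[i] = '-') then i + 1 else i

(* Advance past an optional exponent: e or E, optional sign, one or more
   digits. Returns i unchanged if no well-formed exponent starts there. *)
let skip_exp s i =
  if i < String.length s && (s.[i] = 'e' || s.[i] = 'E') then begin
    let j = skip_sign s (i + 1) in
    let k = skip_digits s j in
    if k > j then k else i
  end else i

(* Surgical fix: optional sign, ONE or more digits, optional fraction,
   optional exponent. *)
let matches_surgical s =
  let n = String.length s in
  let start = skip_sign s 0 in
  let i = skip_digits s start in
  if i = start then false
  else begin
    let i = if i < n && s.[i] = '.' then skip_digits s (i + 1) else i in
    skip_exp s i = n
  end

(* Spec-derived rule with its three numeric alternatives plus the two
   keywords. Note that the spec rule carries no sign. *)
let matches_spec s =
  let n = String.length s in
  if s = "inf" || s = "nan" then true
  else begin
    let i = skip_digits s 0 in
    if i > 0 && i < n && s.[i] = '.' then
      (* digits, dot, optional digits, optional exponent *)
      skip_exp s (skip_digits s (i + 1)) = n
    else if i > 0 then begin
      (* digits followed by a mandatory exponent *)
      let k = skip_exp s i in
      k > i && k = n
    end
    else if n > 0 && s.[0] = '.' then begin
      (* dot, one or more digits, optional exponent *)
      let j = skip_digits s 1 in
      j > 1 && skip_exp s j = n
    end
    else false
  end

(* Print how each rule treats a few sample inputs. *)
let () =
  List.iter
    (fun s ->
      Printf.printf "%S: surgical=%b spec=%b\n" s
        (matches_surgical s) (matches_spec s))
    [ "E1"; ".5"; "1"; "1."; "1e5"; "inf" ]
```

Both versions reject "E1". They differ elsewhere: the surgical change no longer accepts ".5" and still accepts a plain integer such as "1", whereas the spec-derived rule accepts ".5", "inf" and "nan" but requires a dot or an exponent after leading digits.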
Thanks, it's going to take me a bit of time to make the fix (my laptop needs setting up), but I'll try soon.