mransan/ocaml-protoc

Your little float-literal bug


I think the problem is that all three of the parts that precede ['e' 'E'] are optional: the sign, the integer digits, and the fractional part. The typical way to deal with this is to force either the second or the third part (the digits before or after the decimal point) to be non-empty. I don't have an example to hand, but you can probably find one in the OCaml compiler sources, and there are plenty of others in lex specifications all over the place.

let float_literal =
  ['+' '-']? ['0'-'9']*
  ('.' ['0'-'9']* )?
  (['e' 'E'] ['+' '-']? ['0'-'9'] ['0'-'9']*)?
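
To make the diagnosis concrete, here is a minimal sketch in plain OCaml that mirrors the pattern above by hand. It is not ocamllex output, and the helper names (`skip_while`, `skip_opt`, `matches_original`) are made up for illustration. Because every part before the exponent is optional, the recognizer accepts both "E1" and the empty string.

```ocaml
(* Hand-written mirror of the pattern above; all names here are hypothetical. *)

let is_digit c = c >= '0' && c <= '9'
let is_sign c = c = '+' || c = '-'

(* Skip characters satisfying [p] from index [i]; return the first index that fails. *)
let rec skip_while p s i =
  if i < String.length s && p s.[i] then skip_while p s (i + 1) else i

(* Skip at most one character satisfying [p]. *)
let skip_opt p s i =
  if i < String.length s && p s.[i] then i + 1 else i

(* sign? digit* ('.' digit* )? ([eE] sign? digit digit* )? must cover all of [s]. *)
let matches_original s =
  let n = String.length s in
  let i = skip_opt is_sign s 0 in
  let i = skip_while is_digit s i in
  let i =
    if i < n && s.[i] = '.' then skip_while is_digit s (i + 1) else i in
  let i =
    if i < n && (s.[i] = 'e' || s.[i] = 'E') then begin
      let j = skip_opt is_sign s (i + 1) in
      let k = skip_while is_digit s j in
      if k > j then k else i  (* the exponent needs at least one digit *)
    end else i in
  i = n

let () =
  List.iter
    (fun s -> Printf.printf "%S -> %b\n" s (matches_original s))
    [ "E1"; ""; "1.5e3"; "abc" ]
```

Requiring at least one digit on one side of the decimal point, as suggested above, is what rules out the bare-exponent case.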
droyo commented

I'm sure you've seen this @mransan, but https://developers.google.com/protocol-buffers/docs/reference/proto3-spec contains the EBNF for proto files. You can do a pretty mechanical translation from the BNF notation to lex and yacc rules. For example, here is the part for float literals:

decimals  = decimalDigit { decimalDigit }
exponent  = ( "e" | "E" ) [ "+" | "-" ] decimals
floatLit  = ( decimals "." [ decimals ] [ exponent ] | decimals exponent | "." decimals [ exponent ] ) | "inf" | "nan"

This translates almost verbatim to

let decimals = ['0' - '9']+
let exp = ( 'e' | 'E' ) ( '+' | '-' )? decimals
let float_literal = decimals '.' decimals? exp? | decimals exp | '.' decimals exp? | "inf" | "nan"

This version doesn't match "E1" and, IMO, is easier to read. For a more surgical approach, I think you can get away with changing * to + on your first digit match.
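
As a rough check of the spec-derived rule, the sketch below hand-codes the same three alternatives plus "inf" and "nan" in plain OCaml. It is an illustration only, not the ocaml-protoc lexer; `skip_digits`, `exponent_end`, and `matches_spec` are hypothetical names, and the recognizer is written by hand rather than generated by ocamllex.

```ocaml
(* Hand-written mirror of the spec-derived float_literal; names are hypothetical. *)

let is_digit c = c >= '0' && c <= '9'

(* Return the first index at or after [i] that is not a digit. *)
let rec skip_digits s i =
  if i < String.length s && is_digit s.[i] then skip_digits s (i + 1) else i

(* Index just past an exponent starting at [i], or None if none starts there. *)
let exponent_end s i =
  let n = String.length s in
  if i < n && (s.[i] = 'e' || s.[i] = 'E') then begin
    let j =
      if i + 1 < n && (s.[i + 1] = '+' || s.[i + 1] = '-') then i + 2
      else i + 1 in
    let k = skip_digits s j in
    if k > j then Some k else None
  end else None

let matches_spec s =
  let n = String.length s in
  if s = "inf" || s = "nan" then true
  else begin
    let int_end = skip_digits s 0 in
    let has_int = int_end > 0 in
    let has_dot = int_end < n && s.[int_end] = '.' in
    let frac_end = if has_dot then skip_digits s (int_end + 1) else int_end in
    let has_frac = has_dot && frac_end > int_end + 1 in
    let stop, has_exp =
      match exponent_end s frac_end with
      | Some k -> (k, true)
      | None -> (frac_end, false) in
    stop = n
    && ((has_int && has_dot)                    (* decimals '.' decimals? exp? *)
        || (has_int && not has_dot && has_exp)  (* decimals exp *)
        || (not has_int && has_frac))           (* '.' decimals exp? *)
  end

let () =
  List.iter
    (fun s -> Printf.printf "%S -> %b\n" s (matches_spec s))
    [ "E1"; "1.5e3"; ".5"; "1."; "1e5"; "12"; "inf" ]
```

One trade-off worth noting: the one-character change (* to + on the first digit match) would also reject "E1", but it would stop accepting a leading-dot form such as ".5", which the spec grammar allows.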

Thanks. It's going to take me a bit of time to make the fix (my laptop needs setting up), but I'll try soon.