cenotelie/hime

Regression for "Refactor NFA transition normalisation"

Closed · 1 comment

It seems that commit e48e217 introduced a regression in NFA generation. The following grammar reproduces it:

grammar C
{
	options
	{
		Axiom = "translation_unit";
		Separator = "SEPARATOR";
	}
	terminals
	{
		// A.1.1 Line terminators
		NEW_LINE		-> U+000D /* CR */
						|  U+000A /* LF */
						|  U+000D U+000A /* CR LF */
						|  U+0085 // Next line character
						|  U+2028 // Line separator character
						|  U+2029 ; //Paragraph separator character (U+2029)

		// A.1.2 White space
		WHITE_SPACE		-> uc{Zs} | U+0009 | U+000B | U+000C ;

		// A.1.3 Comments
		COMMENT_LINE	-> '//' (.* - (.* NEW_LINE .*)) ;
		COMMENT_BLOCK	-> '/*' (.* - (.* '*/' .*)) '*/' ;

		// A.1.6 Identifiers
		// fragment IDENTIFIER_CHAR		-> uc{Lu} | uc{Ll} | uc{Lt} | uc{Lm} | uc{Lo} | uc{Nl} ;
		// IDENTIFIER			-> (IDENTIFIER_CHAR | '_') (IDENTIFIER_CHAR | '_' | uc{Nd} | uc{Pc} | uc{Cf})* ;
		IDENTIFIER			->  [a-zA-Z_] [a-zA-Z0-9_]* ;

		// A.1.8 Literals
		INTEGER_LITERAL_DECIMAL		-> ('0' | [1-9] [0-9]*) ([Uu] [Ll]? | [Ll] [Uu]? )? ;
		INTEGER_LITERAL_HEXA		-> '0' [xX] [a-fA-F0-9]+ ([Uu] [Ll]? | [Ll] [Uu]? )? ;

		fragment EXPONENT -> [eE] ('+'|'-')? ('0' | [1-9] [0-9]*) ;
		fragment REAL_LITERAL_SUFFIX -> [FfDdMm] ;
		REAL_LITERAL				-> ('0' | [1-9] [0-9]*)? '.' ('0' | [1-9] [0-9]*) EXPONENT? REAL_LITERAL_SUFFIX?
									|  ('0' | [1-9] [0-9]*) EXPONENT  REAL_LITERAL_SUFFIX?
									|  ('0' | [1-9] [0-9]*) REAL_LITERAL_SUFFIX;
		fragment HEX_ESCAPE_LITERAL -> '\\' 'x' [a-fA-F0-9]{1,4}
											| '\\' [uU] [a-fA-F0-9]{4} ([a-fA-F0-9]{4})? ;
		CHARACTER_LITERAL			-> '\'' ( (. - ('\'' | '\\' | NEW_LINE))
											| '\\' ('\'' | '"' | '\\' | [0abfnrtv])
											| HEX_ESCAPE_LITERAL
										) '\'' ;
		STRING_LITERAL		-> '"'  ( (. - ('"' | '\\' | NEW_LINE))
											| '\\' ('\'' | '"' | '\'' | '\\' | [0abfnrtv])
											| HEX_ESCAPE_LITERAL
										)* '"' ;


		SEPARATOR		-> (NEW_LINE | WHITE_SPACE | COMMENT_LINE | COMMENT_BLOCK)+;
	}
}

Panic:

thread '<unnamed>' panicked at F:\Git\github.com\stevefan1999-personal\hime\sdk-rust\src\lib.rs:74:9:
  invalid char span: 2029-2028

The reported span (2029-2028) appears to correspond to these two lines of the NEW_LINE terminal:

						|  U+2028 // Line separator character
						|  U+2029 ; //Paragraph separator character (U+2029)
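
For illustration only, here is a minimal Rust sketch of one way an inverted span like `2029-2028` could arise. This is a guess, not a diagnosis of e48e217. `CharSpan`, `try_new` and `intersect` are hypothetical names and are not taken from the Hime code base. The sketch assumes spans are inclusive and that normalisation intersects spans at some point. For the disjoint single-character spans U+2028 and U+2029, the naive bounds `max(begin)` and `min(end)` come out inverted, which matches the values in the panic message.

```rust
/// Hypothetical inclusive span of code points; NOT Hime's actual type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CharSpan {
    pub begin: u32,
    pub end: u32,
}

impl CharSpan {
    /// Builds a span, reporting inverted bounds as an error instead of panicking.
    pub fn try_new(begin: u32, end: u32) -> Result<CharSpan, String> {
        if begin > end {
            Err(format!("invalid char span: {:X}-{:X}", begin, end))
        } else {
            Ok(CharSpan { begin, end })
        }
    }

    /// Intersection of two spans, or `None` when they are disjoint.
    pub fn intersect(self, other: CharSpan) -> Option<CharSpan> {
        let begin = self.begin.max(other.begin);
        let end = self.end.min(other.end);
        // For disjoint spans the computed bounds are inverted (begin > end),
        // so the fallible constructor turns that case into `None`.
        CharSpan::try_new(begin, end).ok()
    }
}
```

With `ls = [2028, 2028]` and `ps = [2029, 2029]`, `ls.intersect(ps)` yields `None`, whereas passing the raw bounds `max(2028, 2029)` and `min(2028, 2029)` straight to a panicking constructor would produce a message of the same shape as the one above.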

Thank you for your feedback. This is now fixed on master.