DmitrySoshnikov/syntax

Provide a way to pass a character stream to the tokenizer or a way to preprocess the supplied text

fdutton opened this issue · 1 comment

PHP does not handle Unicode very well, so I found it necessary to preprocess the supplied text, folding the Unicode characters into the ASCII character set with iconv('UTF-8', 'ASCII//TRANSLIT', $string).
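For illustration, the preprocessing step described here might look like the sketch below. The helper name foldToAscii is hypothetical and not part of the generated parser, and falling back to the original text when iconv fails is an assumption made for this sketch. Transliteration results depend on the platform's iconv implementation and locale, so characters with no ASCII equivalent may come out as '?' or differ between systems.

```php
<?php
// Hypothetical helper (not part of the generated parser): fold UTF-8 text
// into ASCII before handing it to the parser.
function foldToAscii(string $text): string
{
    // iconv() returns false on failure. The '@' suppresses the notice it
    // raises for characters it cannot convert.
    $folded = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);

    // Assumption for this sketch: keep the original text if folding fails.
    return $folded === false ? $text : $folded;
}
```

The folded string would then be passed to the parser in place of the raw input.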

Instead of passing a string to the parser, I would prefer passing a character stream or specifying a translation function so that I do not have to modify the generated code.

Yeah, instead of a single generic parse function on the parser in the template, feel free to add parseFromString (the default parse would just call it) and parseFromCharStream to the PHP plugin.
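A rough sketch of how those entry points could be layered is shown below. Only the names parseFromString and parseFromCharStream come from this thread. The class name SketchParser, the optional $translate callable, and the whitespace-splitting stand-in for the real parsing routine are all assumptions made for illustration, not the template's actual API.

```php
<?php
// Hypothetical sketch of the suggested entry points. The real generated
// parser would run its tokenizer and parse table in parseFromString().
final class SketchParser
{
    // Default entry point: simply delegates to parseFromString().
    public static function parse(string $text): array
    {
        return self::parseFromString($text);
    }

    // Stand-in for the real parsing routine: splits the input on whitespace.
    public static function parseFromString(string $text): array
    {
        return preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    }

    // Accepts any iterable of string chunks (array, generator, ...).
    // The optional $translate callable (an assumption of this sketch) is
    // applied to the joined text, so multibyte characters split across
    // chunk boundaries are reassembled before translation.
    public static function parseFromCharStream(iterable $chunks, ?callable $translate = null): array
    {
        $buffer = '';
        foreach ($chunks as $chunk) {
            $buffer .= $chunk;
        }
        if ($translate !== null) {
            $buffer = $translate($buffer);
        }
        return self::parseFromString($buffer);
    }
}
```

With this shape, a caller could pass a translation function such as an iconv wrapper as the second argument to parseFromCharStream and would not need to modify the generated code.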