tree-sitter/tree-sitter-php

Variable name in Chinese recognized as error

vpxyz opened this issue · 1 comments

vpxyz commented

Hi,
tree-sitter-php seems to be unable to recognize valid variable names with Chinese characters:

<?php
$漢字;

produces this syntax tree:

(program (php_tag)
 (ERROR $ (ERROR))
 (empty_statement ;))

Tested with tree-sitter 0.20.7 and the latest tree-sitter-php source tree.
Thank you

As #142 stated, the grammar currently does not conform to the PHP implementation.

Even worse, PHP also allows label and variable names in encodings other than UTF-8/UTF-16, and tree-sitter does not accept those.

For Chinese characters alone, as long as the source is encoded in UTF-8 or UTF-16, changing the name rule in the grammar to accept them may be sufficient.

If other encodings are to be considered, tree-sitter itself would need to be extended to support them.
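
For reference, the PHP manual documents valid variable names with the byte-level pattern `[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*`, so any byte from 0x80 to 0xff is accepted regardless of encoding. Below is a rough sketch of that rule in Python, only to illustrate the behaviour a grammar would need to match. The helper name `is_php_variable` is made up for this example and is not part of tree-sitter or tree-sitter-php.

```python
import re

# Byte-level name rule as documented in the PHP manual.
# Every byte in 0x80-0xff counts as a name character, which is why
# multi-byte UTF-8 sequences (and single-byte encodings) are accepted.
PHP_NAME = re.compile(rb"[a-zA-Z_\x80-\xff][a-zA-Z0-9_\x80-\xff]*")


def is_php_variable(source: bytes) -> bool:
    """Return True if `source` is `$` followed by a valid PHP name.

    Hypothetical helper for illustration only; it works on raw bytes,
    mirroring how the PHP lexer treats names.
    """
    return source.startswith(b"$") and PHP_NAME.fullmatch(source[1:]) is not None


if __name__ == "__main__":
    # UTF-8 encoded Chinese characters consist only of bytes >= 0x80.
    print(is_php_variable("$漢字".encode("utf-8")))
    # A Latin-1 encoded name is also accepted at the byte level.
    print(is_php_variable("$café".encode("latin-1")))
    # A leading digit is rejected.
    print(is_php_variable(b"$1abc"))
```

The Latin-1 case is the part a UTF-8/UTF-16-only parser cannot reproduce, since the byte 0xE9 on its own is not valid UTF-8.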