UTF-16 encoding support is wanted
CallmeNezha opened this issue · 1 comments
I'm using py-tree-sitter in a text editor written with PyQt. Since Qt's QPlainTextEdit and QTextDocument use UTF-16 internally, I was wondering if you could add UTF-16 encoding support to this Python binding. tree-sitter itself supports UTF-16, but as far as I can see binding.c uses UTF-8 as the default input encoding. Right now I have to convert Qt's string to UTF-8 before passing it to py-tree-sitter's parser, and then translate the returned points and byte offsets back into QTextDocument cursor positions. This conversion is complex and error-prone.
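For anyone hitting the same problem in the meantime, here is a minimal sketch of the offset mapping described above. It assumes the parser was given `text.encode("utf-8")` and that the editor counts positions in UTF-16 code units. The helper names are hypothetical and not part of py-tree-sitter or Qt.

```python
def utf8_byte_to_utf16_units(text: str, byte_offset: int) -> int:
    """Map a byte offset into text.encode("utf-8") to a UTF-16 code-unit offset.

    Hypothetical helper, not part of py-tree-sitter.
    """
    data = text.encode("utf-8")
    if not 0 <= byte_offset <= len(data):
        raise ValueError("byte_offset is outside the encoded text")
    # Decoding the prefix raises UnicodeDecodeError if the offset
    # falls in the middle of a multi-byte character.
    prefix = data[:byte_offset].decode("utf-8")
    # Each UTF-16 code unit is 2 bytes; characters outside the BMP take 2 units.
    return len(prefix.encode("utf-16-le")) // 2


def utf16_units_to_utf8_byte(text: str, unit_offset: int) -> int:
    """Inverse mapping: UTF-16 code-unit offset to UTF-8 byte offset."""
    data = text.encode("utf-16-le")
    if not 0 <= unit_offset * 2 <= len(data):
        raise ValueError("unit_offset is outside the encoded text")
    # Decoding raises UnicodeDecodeError if the offset splits a surrogate pair.
    prefix = data[: unit_offset * 2].decode("utf-16-le")
    return len(prefix.encode("utf-8"))


sample = "a😀é b"  # 'a' = 1 byte, '😀' = 4 bytes / 2 units, 'é' = 2 bytes / 1 unit
after_emoji_units = utf8_byte_to_utf16_units(sample, 5)
after_emoji_bytes = utf16_units_to_utf8_byte(sample, 3)
```

This re-encodes the prefix on every call, so it costs time linear in the offset. That is fine for occasional lookups but would probably need a cached per-line table for a large document, which is part of why native UTF-16 support would be nicer.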
PR welcome.
This should be replaced with ts_parser_parse_string_encoding, choosing the encoding based on the encoding of the buffer:
py-tree-sitter/tree_sitter/binding.c
Line 1523 in e9c956c
I don't know if the callable version has any way of finding the encoding:
py-tree-sitter/tree_sitter/binding.c
Line 1535 in e9c956c