LaunchPlatform/beancount-parser

Account tokens do not support unicode

Entze opened this issue · 4 comments

Entze commented

Hi,

I wanted to try this parser on my ledger. The ledger is written in my native language, which uses umlauts (ä, ö, ü), but the parser throws an exception when parsing them.

A minimal example:

from beancount_parser.parser import make_parser

parser = make_parser()

parser.parse("1970-01-01 open Assets:Test:Erträge") # raises lark.exceptions.UnexpectedCharacters

I haven't tested it, but the parser probably doesn't support accented characters (à, è) either.

A fix shouldn't be that complicated. If you'd like, I can also submit a pull request.

Best regards.

Oh, interesting, I didn't know you could use Unicode characters in an account name. I'll look into this issue later when I have a moment.

Thanks for reporting

I'm using Chinese. Simply edit the ACCOUNT_CHAR rule in account.lark to the following (for reference only):

ACCOUNT_CHAR: LETTER | DIGIT | "-" | /[\u4e00-\u9fa5]/
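One caveat on the workaround above: the range \u4e00-\u9fa5 only covers CJK ideographs, so it would not help with the umlauts from the original report (ä is U+00E4). Here is a minimal sketch using Python's standard `re` module rather than the project's Lark grammar. The pattern names and the `make_account_re` helper are hypothetical and for illustration only, and the account shape is simplified (it does not check root account names or capitalization). It compares an ASCII-only character class, the CJK-range class, and a class that accepts any Unicode letter or digit.

```python
import re

# Hypothetical character classes for illustration; these are NOT the
# definitions used in beancount-parser's account.lark.
ASCII_CHAR = r"[A-Za-z0-9-]"                # ASCII letters, digits, hyphen
CJK_CHAR = r"[A-Za-z0-9\u4e00-\u9fa5-]"     # ASCII plus the CJK range from the comment above
UNICODE_CHAR = r"(?:[^\W_]|-)"              # any Unicode letter or digit, plus hyphen


def make_account_re(char: str) -> re.Pattern:
    """Build a simplified account pattern: two or more colon-separated components."""
    return re.compile(rf"{char}+(?::{char}+)+")


ascii_account = make_account_re(ASCII_CHAR)
cjk_account = make_account_re(CJK_CHAR)
unicode_account = make_account_re(UNICODE_CHAR)

for name in ("Assets:Test:Erträge", "Assets:银行:储蓄"):
    print(
        name,
        bool(ascii_account.fullmatch(name)),
        bool(cjk_account.fullmatch(name)),
        bool(unicode_account.fullmatch(name)),
    )
```

In this sketch only the last class accepts both names, which suggests a general fix would need to match Unicode letters broadly instead of listing one script's code point range.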

Closed by #12.