Account tokens do not support Unicode
Entze opened this issue · 4 comments
Entze commented
Hi,
I wanted to try out this parser on my ledger. My ledger is in my native language, which includes umlauts (ä, ö, ü); however, the parser throws an exception when parsing them.
A minimal example:
from beancount_parser.parser import make_parser
parser = make_parser()
parser.parse("1970-01-01 open Assets:Test:Erträge") # raises lark.exceptions.UnexpectedCharacters
I haven't tested it, but the parser probably doesn't support accented characters (à, è) either.
A fix shouldn't be that complicated. If you want me to, I can submit a pull request.
Best regards.
fangpenlin commented
Oh, interesting, I didn't know that you could use Unicode characters in an account name. I'll look into this issue later when I have a moment.
fangpenlin commented
Thanks for reporting
huruka commented
I'm using Chinese. I simply edited account.lark to:
ACCOUNT_CHAR: LETTER | DIGIT | "-" | /[\u4e00-\u9fa5]/
For reference only.
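To illustrate what the CJK range above does and does not cover, here is a minimal sketch using only Python's standard `re` module. The pattern names and the `make_account_re` helper are hypothetical, and the account shape (colon-separated components) is a simplification; this is not the grammar that beancount_parser actually uses. It suggests that a range limited to `\u4e00-\u9fa5` accepts Chinese account names but would still reject umlauts, whereas a general Unicode letter class accepts both.

```python
import re

# Hypothetical character classes for illustration only.
ASCII_CHAR = r"[A-Za-z0-9\-]"                 # ASCII letters, digits, hyphen
CJK_CHAR = r"[A-Za-z0-9\-\u4e00-\u9fa5]"      # adds the CJK range from the comment above
UNICODE_CHAR = r"(?:[^\W_]|-)"                # any Unicode letter or digit, or a hyphen


def make_account_re(char: str) -> re.Pattern:
    """Build a simplified account pattern: two or more colon-separated components."""
    return re.compile(rf"{char}+(?::{char}+)+")


ascii_re = make_account_re(ASCII_CHAR)
cjk_re = make_account_re(CJK_CHAR)
unicode_re = make_account_re(UNICODE_CHAR)


def is_account(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole string matches the simplified account pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for name in ("Assets:Test:Erträge", "Assets:银行"):
        print(name, is_account(ascii_re, name), is_account(cjk_re, name),
              is_account(unicode_re, name))
```

In Python 3, `[^\W_]` on a `str` pattern matches letters and digits from any script, which is why the last pattern handles both the German and the Chinese example without listing specific ranges.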
fangpenlin commented
Closed by #12.