breuleux/jurigged

[Windows] Non-Latin characters are shown incorrectly upon function reload

Opened this issue · 1 comments

Hello! Thank you for the library! I have one issue that prevents me from using jurigged on a regular basis: non-Latin characters are handled incorrectly. But enough talk, check the code:

def make_greet_message(name):
    return f"Привет, {name}!"


def main():
    while True:
        name = input("Enter your name: ")
        print(make_greet_message(name))


if __name__ == '__main__':
    main()

The make_greet_message function returns a greeting containing the Russian word Привет (Hello), and everything works fine:

Enter your name: alex
Привет, alex!

However, if I change the text to some other non-Latin text, such as "Добрый день, {name}!" (Good day), and the function is reloaded, the output looks different:

Enter your name: alex
Добрый день, alex!

I cannot reproduce this issue on Manjaro Linux, but it happens on Windows 11 22H2 with Russian as the default language. What should I change to fix this? I am testing in the integrated PowerShell terminal in PyCharm 2023.3.3.

I have the same problem
A possible root cause is Python's odd preferred encoding on Windows (Unix systems, as far as I know, use only UTF-8):

import locale
import sys

# Windows 10 with Russian as the default language
sys.getfilesystemencoding()    # 'utf-8'
locale.getpreferredencoding()  # 'cp1252'

https://stackoverflow.com/questions/36303919/what-encoding-does-open-use-by-default
https://docs.python.org/3/library/functions.html#open
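
To illustrate the suspected cause, here is a minimal sketch. It assumes (this is not confirmed by the maintainer) that the reloaded source is read with open() without an explicit encoding, so that on Windows the locale's preferred encoding is used. The file name greet.py and the hard-coded 'cp1252' are only for illustration; 'cp1252' stands in for whatever locale.getpreferredencoding() reports on the affected machine.

```python
import tempfile
from pathlib import Path

# One line of source code saved by the editor as UTF-8.
source = 'return f"Добрый день, {name}!"\n'

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greet.py"  # hypothetical file name
    path.write_text(source, encoding="utf-8")

    # Simulates open(path) on a system whose preferred encoding is cp1252.
    with open(path, encoding="cp1252") as f:
        misread = f.read()

    # An explicit encoding gives the same result on every platform.
    with open(path, encoding="utf-8") as f:
        correct = f.read()

print(misread == source)  # False: the Cyrillic text was decoded as mojibake
print(correct == source)  # True
```

If this assumption holds, the original file is untouched and only the text decoded at reload time is wrong, which would match the behavior described above.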

I also found some solutions:

  1. Set the environment variable PYTHONUTF8=1.
  2. Pass encoding='utf-8' to every open() call (#13).
  3. Wait for PEP 686 (Python 3.15), which makes UTF-8 mode the default (auto-fix).
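
As a rough check of the first workaround, the sketch below starts a child interpreter with PYTHONUTF8=1 and prints what it reports. Nothing here is specific to jurigged; it only shows that the variable switches Python into UTF-8 mode, which changes the default encoding used by open().

```python
import os
import subprocess
import sys

# Copy the current environment and enable Python's UTF-8 mode for the child.
env = dict(os.environ, PYTHONUTF8="1")

# The child reports its UTF-8 mode flag and its preferred encoding.
child_code = (
    "import locale, sys; "
    "print(sys.flags.utf8_mode, locale.getpreferredencoding(False))"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
utf8_mode, preferred = result.stdout.split()

print(utf8_mode)   # "1" when UTF-8 mode is active
print(preferred)   # "utf-8" or "UTF-8", depending on the Python version
```

In PowerShell the variable can be set for the current session with $env:PYTHONUTF8 = "1" before starting the script.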