malxau/yori

Autocomplete for Chinese path is weird

z0ow commented

For example, there is a directory "MCNP格式转换工具" in the Desktop directory, and the PWD is Desktop. If I type "cd MC" and press TAB to autocomplete the remaining characters, the displayed result looks weird.

I guess yori lays out text assuming every character has the same width (the width of one ASCII character), but each Chinese character occupies two widths.
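
To illustrate the suspected mismatch, here is a minimal Python sketch that compares the number of characters in the directory name with the number of terminal cells it would occupy. It is not yori's code. The `display_width` helper is hypothetical, and it assumes that characters with the Unicode East Asian Width property "W" or "F" take two cells and everything else takes one.

```python
import unicodedata


def display_width(text):
    """Estimate terminal cells used by text (hypothetical helper).

    Assumption: East Asian Width "W" (wide) and "F" (fullwidth) characters
    take two cells; every other character takes one.
    """
    width = 0
    for ch in text:
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


name = "MCNP格式转换工具"
print(len(name))            # characters: 10
print(display_width(name))  # estimated cells: 16
```

A program that advances its cursor bookkeeping by one cell per character would think this name ends 6 cells earlier than where the terminal actually drew it, which would explain the garbled line after pressing TAB.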

Wish this bug could be fixed in the next update 😄

Yes, double width characters won't work well in any of these tools. I've looked into this before and they seem very difficult to handle correctly.

As far as I know, each character in Unicode can be:

  • Single width
  • Double width
  • Ambiguous

This creates two things that are uncertain:

  • Ambiguous means the width is not fixed: each font can display the character differently, so the only way to know how wide the character is is to examine the font used to display it. Now that the Windows Terminal is open source, we can see this logic at https://github.com/microsoft/terminal/blob/fb597ed304ec6eef245405c9652e9b8a029b821f/src/renderer/gdi/math.cpp#L41 .
  • We don't know if a font implements a given character. If it doesn't implement the character, a fallback character will be used, so the amount of space on the display is the width of the fallback character.
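
As a rough illustration of the three categories and of the first uncertainty, the Python sketch below classifies characters with the standard library's `unicodedata.east_asian_width` and reports the narrowest and widest layout a string could have. The names `width_class` and `width_bounds` are made up for this example, and the mapping of "W"/"F" to double and "A" to ambiguous is an assumption about how a tool might use the property. It ignores combining characters and says nothing about font fallback, which the Unicode data cannot answer.

```python
import unicodedata


def width_class(ch):
    """Map one character to 'single', 'double' or 'ambiguous' (hypothetical)."""
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):   # wide and fullwidth
        return "double"
    if eaw == "A":          # width depends on the font / terminal
        return "ambiguous"
    return "single"         # "Na", "H" and "N"


def width_bounds(text):
    """Return (min_cells, max_cells) for text.

    Ambiguous characters count as 1 cell in the minimum and 2 in the maximum.
    """
    lo = hi = 0
    for ch in text:
        cls = width_class(ch)
        lo += 2 if cls == "double" else 1
        hi += 1 if cls == "single" else 2
    return lo, hi


print(width_class("M"), width_class("格"), width_class("α"))
print(width_bounds("α格M"))  # (4, 5)
```

When the two bounds differ, the program cannot know the real layout without knowing what the terminal and its font will do.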

Both of these can only be known by first knowing what the display font is. It is very unnatural for a command line program to know what font is being used to display it. Strictly speaking, a Windows command line program running locally can find this out by calling GetCurrentConsoleFontEx, and then calling into the UI libraries to determine what the font can do. As far as I know, it's not really possible to do this across SSH or another pipe that cannot communicate the font used to display the characters.
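
For reference, a local query of the console font could look roughly like the Python `ctypes` sketch below. It is only an illustration of the GetCurrentConsoleFontEx call, not what yori does. The helper name `current_console_font` is hypothetical, the structure layout follows the documented CONSOLE_FONT_INFOEX, and the function simply returns `None` when it is not on Windows or when standard output is not a console (for example when redirected or running over a pipe).

```python
import ctypes
import os


class COORD(ctypes.Structure):
    _fields_ = [("X", ctypes.c_short), ("Y", ctypes.c_short)]


class CONSOLE_FONT_INFOEX(ctypes.Structure):
    _fields_ = [
        ("cbSize", ctypes.c_uint32),
        ("nFont", ctypes.c_uint32),
        ("dwFontSize", COORD),
        ("FontFamily", ctypes.c_uint32),
        ("FontWeight", ctypes.c_uint32),
        ("FaceName", ctypes.c_wchar * 32),
    ]


def current_console_font():
    """Return (face_name, cell_width, cell_height), or None if unavailable."""
    if os.name != "nt":
        return None  # the API only exists on Windows
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [ctypes.c_uint32]
    kernel32.GetStdHandle.restype = ctypes.c_void_p
    kernel32.GetCurrentConsoleFontEx.argtypes = [
        ctypes.c_void_p,
        ctypes.c_int,
        ctypes.POINTER(CONSOLE_FONT_INFOEX),
    ]
    kernel32.GetCurrentConsoleFontEx.restype = ctypes.c_int

    handle = kernel32.GetStdHandle(0xFFFFFFF5)  # STD_OUTPUT_HANDLE, (DWORD)-11
    info = CONSOLE_FONT_INFOEX()
    info.cbSize = ctypes.sizeof(CONSOLE_FONT_INFOEX)  # must be set by the caller
    if not kernel32.GetCurrentConsoleFontEx(handle, False, ctypes.byref(info)):
        return None  # not a console, or the call failed
    return info.FaceName, info.dwFontSize.X, info.dwFontSize.Y


print(current_console_font())
```

Even when this succeeds, it only gives the font name and cell size. Working out whether that font has a glyph for a given character, and how wide it is, still needs further calls into the UI libraries.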

If there are better ways to handle this, I'd really like to hear about them. This can't be the first project to face these problems, but I don't see how layout can be correct unless the console and command line program agree - exactly - on how wide each character needs to be.