astral-sh/ruff

Allow character ranges or locales in `allowed-confusables` for RUF001, RUF002, & RUF003


Rules RUF001, RUF002, and RUF003 flag potentially ambiguous Unicode characters. The `allowed-confusables` setting can list Unicode characters that are permitted, but maintaining that list becomes cumbersome for projects whose docstrings use a lot of mathematical notation, or for projects written in languages that use (for example) the Cyrillic alphabet.

It would be quite convenient if `allowed-confusables` could contain character ranges. Some possible use cases are:

allowed-confusables = ["α-ω"]  # lower case Greek letters for use in equations
allowed-confusables = ["∀-⋿"]  # Mathematical operators Unicode block
allowed-confusables = ["Ѐ-ӿ"]  # Cyrillic alphabet

Allowing character ranges would make it easier to configure exceptions to these rules. It'd also be less likely that we'd forget to include certain Unicode characters.
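Until something like this exists, one possible workaround is to generate the explicit list with a small script and paste the result into the configuration. The sketch below is only an illustration of the proposed range syntax, not anything Ruff implements: the helper name `expand_confusable_ranges` and the `"X-Y"` parsing rule are assumptions made here.

```python
def expand_confusable_ranges(specs):
    """Expand hypothetical range entries such as "α-ω" into single characters.

    An entry of exactly three characters with "-" in the middle is treated
    as an inclusive code point range; anything else is taken literally,
    one character at a time.
    """
    chars = []
    for spec in specs:
        if len(spec) == 3 and spec[1] == "-":
            start, end = ord(spec[0]), ord(spec[2])
            if start > end:
                raise ValueError(f"range is reversed: {spec!r}")
            chars.extend(chr(cp) for cp in range(start, end + 1))
        else:
            chars.extend(spec)
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(chars))


if __name__ == "__main__":
    expanded = expand_confusable_ranges(["α-ω"])
    # Naive quoting: fine for these ranges, but a '"' or '\' character
    # would need TOML escaping.
    items = ", ".join(f'"{c}"' for c in expanded)
    print(f"allowed-confusables = [{items}]")
```

Expanding a whole block this way also lists characters that the rules would never flag, so the generated list is longer than strictly necessary.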

A related possibility would be to allow regular expressions, since there may be some use cases where a certain character should be allowed but only in a specific context.

Many thanks!

That does sound annoying. I just looked through VS Code's confusables code, and it seems to have some concept of locale-specific confusables. That's why I'm not sure that allowing character ranges is the right solution. A better approach might be to allow configuring the locale(s) and whether mathematical operators should be allowed.


Very interesting! I hadn't thought of that! Defining locales (and being able to specify more than one) would make it even easier to configure allowed confusables. I updated the title to include that possibility.
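To make the locale idea a bit more concrete, here is a rough sketch of what such a check could look like. It is not how VS Code or Ruff works: the function `is_allowed`, its `scripts` and `allow_math` parameters, and the approach of matching the script from the Unicode character name are all assumptions chosen for illustration.

```python
import unicodedata

# The Mathematical Operators block, U+2200 through U+22FF.
MATH_OPERATORS = range(0x2200, 0x2300)


def is_allowed(ch, scripts, allow_math=False):
    """Return True if `ch` belongs to one of the allowed scripts.

    The script is guessed from the Unicode character name, e.g.
    "CYRILLIC SMALL LETTER A" or "GREEK SMALL LETTER ALPHA". This is a
    heuristic, not a real script-property lookup.
    """
    if allow_math and ord(ch) in MATH_OPERATORS:
        return True
    name = unicodedata.name(ch, "")
    return any(name.startswith(script + " ") for script in scripts)


if __name__ == "__main__":
    cyrillic_a = "\u0430"  # CYRILLIC SMALL LETTER A, looks like Latin "a"
    greek_alpha = "\u03b1"  # GREEK SMALL LETTER ALPHA
    for_all = "\u2200"  # FOR ALL, from the Mathematical Operators block

    print(is_allowed(cyrillic_a, {"CYRILLIC"}))
    print(is_allowed(greek_alpha, {"CYRILLIC"}))
    print(is_allowed(for_all, set(), allow_math=True))
```

A project could then declare something like a list of scripts plus a math flag instead of enumerating characters, though the exact option names would be up to the maintainers.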