astral-sh/ruff

Feature request: Option to use lowercase hex literals

DavidBuchanan314 opened this issue · 6 comments

For a brief time window, black had this feature before it was reverted (psf/black#1692).

I'm a big fan of lowercase hex literals, for most of the practical reasons outlined in the above issue, and also for purely aesthetic reasons - uppercase feels shouty.

If I had my way I'd make lowercase the default, but failing that it'd be great to have a configuration option.

Hi @DavidBuchanan314

If I understand correctly, this is about formatting hex number literals like 0xEF? I'm asking because Ruff normalizes hex escape sequences in string literals to lowercase (Playground).
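To make the distinction concrete, here is a minimal sketch in plain Python (nothing Ruff-specific, and the variable names are only illustrative). It shows the two places where hex digits can appear, and that in both the case of the digits has no effect on the resulting value:

```python
# Hex number literal: the case of the digits does not affect the value.
number_upper = 0xEF
number_lower = 0xef
assert number_upper == number_lower == 239

# Hex escape inside a string literal: case does not affect the value either.
escape_upper = "\xEF"
escape_lower = "\xef"
assert escape_upper == escape_lower == chr(0xEF)
```

So either normalization is purely cosmetic; the question is only which spelling the formatter should prefer.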

My preference would be to change the default to lowercase to make it consistent with hex formatting in strings. However, from reading through Black's issues and discussions on the string hex formatting PRs, I understand that Black changed the preferred casing for hex number literals multiple times and that it's unlikely that they'll change it again in the future. I don't know if I want to deviate from Black's default, considering that the benefits are marginal.

That means that supporting lowercase hex formatting most likely requires adding a new formatter option. I'm not opposed to this, but we need to make a holistic decision on whether we want to support formatting-related options or keep Black's opinionated stance of having no formatting-related settings (or very few).

Yeah, it's the number literals I'm interested in. As for whether it's actually worth it, I guess I'll leave that to everyone else to decide :)

To me this kind of configuration is great. It shouldn't affect other formatting, and it allows adopting ruff formatting on an existing codebase that uses either style.

I would also like to propose three values for this kind of config:

  • uppercase
  • lowercase
  • dont_touch (any_case)

The last option lets the programmer choose the case and keeps the formatter from modifying it, which should also make the formatter faster 🐎
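A rough sketch of how the three values could behave, in Python for illustration only. The function name `normalize_hex_literal`, the option strings, and the choice to always lowercase the `0x` prefix are assumptions made for this example; none of this is Ruff's actual implementation or an existing setting:

```python
import re

# Hypothetical option values mirroring the three proposed above.
STYLES = ("uppercase", "lowercase", "dont_touch")

# A hex integer literal such as 0xEF, 0Xef or 0xDEAD_BEEF.
_HEX_LITERAL = re.compile(r"0[xX](?:_?[0-9a-fA-F])+")


def normalize_hex_literal(literal: str, style: str) -> str:
    """Return `literal` with its hex digits cased according to `style`.

    Assumption for this sketch: the prefix is always written as `0x`,
    except under "dont_touch", where the literal is returned unchanged.
    """
    if style not in STYLES:
        raise ValueError(f"unknown style: {style!r}")
    # Leave the text alone if asked to, or if it is not a hex literal.
    if style == "dont_touch" or not _HEX_LITERAL.fullmatch(literal):
        return literal
    digits = literal[2:]
    digits = digits.upper() if style == "uppercase" else digits.lower()
    return "0x" + digits
```

With this sketch, `normalize_hex_literal("0Xef", "uppercase")` gives `"0xEF"`, the `"lowercase"` style gives `"0xef"`, and `"dont_touch"` returns `"0Xef"` as written.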

Piggybacking here a bit, but I was quite surprised when "Ruff format" changed the content of a hex string [1], and perplexed that the preferences for string and number literals differ.

A "don't touch" or "off" option would also be great for introducing the formatter to existing code bases.

Footnotes

  1. That was on a project already checked with black. Maybe black just didn't detect that case. I know the change doesn't alter the value, but it was still surprising.

I was just thinking about this again, and another argument for lower-case is that the hex() built-in produces lower-case hex.

>>> hex(123)
'0x7b'
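For what it's worth, the format mini-language leans the same way: lowercase comes from the plain `x` type, and uppercase has to be requested explicitly with `X`. A small sketch using only built-ins:

```python
value = 123

# hex() and the "x" format type produce lowercase digits.
assert hex(value) == "0x7b"
assert format(value, "#x") == "0x7b"
assert f"{value:x}" == "7b"

# Uppercase has to be requested with "X"; combined with "#",
# the prefix is uppercased as well.
assert format(value, "#X") == "0X7B"
assert f"{value:X}" == "7B"
```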

I was looking for this option too. The formal representation for hexadecimal values in the ipython console is in lower case:

>>> 0xABB4AB8A
2880744330

>>> hex(2880744330)
'0xabb4ab8a'

>>> bin(200)
'0b11001000'

At a quick glance, the B and the 8 can be mistaken for each other, and so can the 4 and the A. The characters are much more distinct in lower case...

For the exponent, lower case e is preferred, and here black is consistent:

>>> 1E100
1e+100

The formal representation of strings also prefers single quotes, unless the string itself contains a single quote:

>>> "hello world!"
'hello world!'

>>> 'text = \'hello world!\''
"text = 'hello world!'"

The Ruff option quote-style fixes this:

[format]
# Prefer single quotes over double quotes.
quote-style = "single"

Lower case f is preferred for a formatted string and black is consistent here:

var = 'world'
f'hello {var}!'

The prefix for raw strings is not changed by black, because code syntax highlighters differentiate R and r, for example for file paths and regular expressions:

file_path = R'C:\Windows\System32'
email = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

Having something similar to the following would be useful:

[format]
# Prefer single quotes over double quotes.
quote-style = 'single'
# Prefer lowercase in hexadecimal and binary.
hex-style = 'lower'

If there are other changes black makes that differ from the formal representation and from the style seen in the official Python documentation itself, then it might also be convenient for end users if these changes were grouped. For example:

# Match Python's official documentation code style
style='python'
# Match black code style
style='black'