astral-sh/ruff

[`ruff`] Formatting hex codes changes output with f-string debug

I had this idea while reading psf/black#4522 when @MichaReiser said ruff had already stabilized hex code formatting.

I don't actively use ruff and am not familiar with it, but I assume that, just like black, formatting code should not have an observable runtime effect.

In the playground, formatting f"{r'\xFF'=}" gives f"{r'\xff'=}":

>>> f"{r"\xFF"=}"
'r"ÿ"=\'\\\\xFF\''
>>> f"{r"\xff"=}"
'r"ÿ"=\'\\\\xff\''
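For anyone who wants to reproduce the general effect on an older interpreter, here is a minimal sketch. It swaps the raw-string escape from the report for plain integer hex literals (my substitution, not the original case), since backslashes inside f-string expressions need Python 3.12+. The variable names are made up for illustration.

```python
# The `=` debug specifier copies the expression's source text into the
# result, so two spellings of the same value produce different strings.
# Assumption: integer hex literals stand in for the r"\xFF" case above.
upper = f"{0xFF=}"
lower = f"{0xff=}"

print(upper)  # 0xFF=255
print(lower)  # 0xff=255

# Same value (255), different output: rewriting the text inside a debug
# f-string is observable at runtime.
outputs_differ = upper != lower
print(outputs_differ)  # True
```

This only shows why the text inside a debug expression has to be preserved; it does not reproduce the escape-decoding behavior seen in the session above.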

Oh that's a nice find!

Funnily enough, this is fixed in the new preview style (playground).

Is this a bug in the CPython parser? Why does it parse the hex code even though it is a raw string?

❯ uv run --python=3.12 python
>>> import ast

>>> f"{r"\xFF"=}"
'r"ÿ"=\'\\\\xFF\''

>>> print(ast.dump(ast.parse(r'f"{r"\xFF"=}"'), indent=2))
Module(
  body=[
    Expr(
      value=JoinedStr(
        values=[
          Constant(value='r"ÿ"='),
          FormattedValue(
            value=Constant(value='\\xFF'),
            conversion=114)]))],
  type_ignores=[])

>>> r"\xFF"
'\\xFF'

You mean why the hex code is interpreted in the debug expression? I don't know. It is surprising, and I couldn't find anything in the f-string lexing section indicating why it should be (it only says that it calls repr(expr)).

I somewhat suspect that Python parses everything before the = as a regular string and uses that in the debug expression, because Python can't go back to the source for an expression, whereas we can.
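A small sketch that is consistent with this suspicion, though it does not prove it: by the time the AST exists, the debug text is just an ordinary string constant, and f"{x=}" is indistinguishable from a hand-written f"x={x!r}". The variable names here are only for illustration.

```python
import ast

# Parse a debug f-string and its hand-written equivalent.
debug_tree = ast.parse('f"{x=}"', mode="eval")
manual_tree = ast.parse('f"x={x!r}"', mode="eval")

# The JoinedStr holds the debug text as a plain Constant, followed by the
# FormattedValue with the implicit !r conversion (114 == ord("r")).
joined = debug_tree.body
literal_part = joined.values[0].value
conversion = joined.values[1].conversion
print(literal_part, conversion)  # x= 114

# Without position attributes, both trees dump to the same text, so the
# AST alone cannot tell a debug expression from the manual spelling.
same_ast = ast.dump(debug_tree) == ast.dump(manual_tree)
print(same_ast)  # True
```

That would explain why a formatter working from the source has to treat the debug text specially, while the interpreter has already baked it into a string.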

> You mean why the hex code is interpreted in the debug expression?

Yes

> I somewhat suspect that Python parses everything before the = as a regular string and uses that in the debug expression, because Python can't go back to the source for an expression, whereas we can.

Yeah, that's what I thought as well but it's a bit surprising.

> Yeah, that's what I thought as well but it's a bit surprising.

Definitely. It's not what I expected.

Should this be closed, as it's "resolved" on main?

We could. I kept it open to make it clear it's part of the Ruff 2025 style guide and it requires fixing if we, for whatever reason, decide not to stabilize f-string formatting.

Let me add a test for this :)

> Is this a bug in the CPython parser?

I think so: I opened python/cpython#124363.