microsoft/vscode-autopep8

Incorrectly formatting strings containing emojis

papple23g opened this issue · 3 comments

Hello,

I am reporting an issue with the VS Code autopep8 extension, version v2023.8.0. For comparison, I also ran the standalone autopep8 tool, version v2.0.4.

Here is the scenario:

When I have a Python file open in VS Code with the following content:

{"๐Ÿ’" }

After formatting the code with the VS Code autopep8 extension, the closing quote is dropped and the output becomes:

{"๐Ÿ’ }

In contrast, when I directly run autopep8.exe [file] --in-place on the command line, I get the correct formatting:

{"๐Ÿ’"}

It seems like there's an issue with VS Code's integration of autopep8 when handling strings containing emojis. Running autopep8.exe directly preserves the string correctly, but the VS Code extension does not.

I would appreciate any assistance in resolving this issue.

Thank you.

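@papple23g Try the pre-release version of the autopep8 extension, which has the fix for this (try version 2023.9.10221010). The issue occurs because VS Code measures string lengths and positions in UTF-16 code units, while Python measures them in Unicode code points, so an emoji outside the Basic Multilingual Plane counts as two units in VS Code but as one character in Python.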
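
To make the mismatch concrete, here is a minimal Python sketch. It is not the extension's actual fix; the helper names (`utf16_len`, `codepoint_to_utf16_offset`, `delete_utf16_unit`) are hypothetical, and `"\U0001F600"` is only a stand-in for any emoji outside the Basic Multilingual Plane. It assumes the formatter wants to delete the single space after the closing quote and reports that position as a Python string index.

```python
def utf16_len(text: str) -> int:
    """Return the number of UTF-16 code units needed to encode text."""
    return len(text.encode("utf-16-le")) // 2


def codepoint_to_utf16_offset(line: str, index: int) -> int:
    """Convert a Python string index (code points) to a UTF-16 offset."""
    return utf16_len(line[:index])


def delete_utf16_unit(text: str, offset: int) -> str:
    """Delete one UTF-16 code unit at offset, as a UTF-16 based editor would."""
    units = text.encode("utf-16-le")
    edited = units[: 2 * offset] + units[2 * (offset + 1):]
    return edited.decode("utf-16-le")


emoji = "\U0001F600"          # stand-in emoji, one code point, two UTF-16 units
line = '{"' + emoji + '" }'   # same shape as the snippet in the report

space_index = line.index(" ")  # Python index of the space to remove

# Passing the Python index straight through removes the closing quote instead.
wrong = delete_utf16_unit(line, space_index)

# Converting the index to a UTF-16 offset first removes the space as intended.
right = delete_utf16_unit(line, codepoint_to_utf16_offset(line, space_index))

print(len(line), utf16_len(line))
print(wrong)
print(right)
```

Under these assumptions, the unconverted offset lands on the closing quote, which reproduces the symptom in the report: the quote disappears and the space stays.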

Verification steps:

  1. Install the latest pre-release of autopep8 extension
  2. Create something.py:
s = {"๐Ÿ’"   }
  3. Trigger formatting with autopep8 using the Format Document With... command.
  4. It should update to:
s = {"๐Ÿ’"}

@karthiknadig I followed your guidance and tested the pre-release. It worked flawlessly and resolved the issue. Thanks!