omegacen/pylint-sonarjson

Bad character count with unicode

PhML opened this issue · 1 comment

PhML commented

I don’t know at which level it fails: pylint, pylint-sonarjson or sonar.

Given the following configuration:

[MASTER]
load-plugins=pylint_sonarjson
disable=all
enable=W1309
[SONARQUBE]
output-format=sonarjson

and a file to check that contains a Unicode character:

print(f"it’s here!")

the output will be:

{
    "issues": [
        {
            "engineId": "PYLINT",
            "ruleId": "W1309",
            "type": "CODE_SMELL",
            "primaryLocation": {
                "message": "Using an f-string that does not have any interpolated variables",
                "filePath": "file.py",
                "textRange": {
                    "startLine": 1,
                    "startColumn": 6,
                    "endLine": 1,
                    "endColumn": 21
                }
            },
            "severity": "MINOR",
            "effortMinutes": 5
        }
    ]
}

Then, sonar-report will fail with this error:

java.lang.IllegalArgumentException: 21 is not a valid line offset for pointer. File file.py has 20 character(s) at line 1

Indeed, my editor tells me it has 20 characters…

teake commented

I can reproduce this. It looks like a bug in astroid (the library that Pylint uses to parse the source code). Astroid counts the Unicode character as three columns (which is easier to see if you add more ’ characters to the linted code in question). I've opened an issue upstream, pylint-dev/astroid#1744.
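For anyone who wants to see the discrepancy without running Pylint, here is a minimal sketch using only Python's standard `ast` module, not astroid itself. It assumes astroid takes its column offsets from `ast`, which documents `col_offset` and `end_col_offset` as UTF-8 byte offsets. The helper `byte_to_char_offset` is hypothetical and only illustrates one possible conversion; it is not part of pylint, astroid, or pylint-sonarjson.

```python
import ast

# The line from the report; the apostrophe is U+2019, which is 3 bytes in UTF-8.
source = 'print(f"it’s here!")\n'
line = source.splitlines()[0]


def byte_to_char_offset(text: str, byte_offset: int) -> int:
    """Hypothetical helper: convert a UTF-8 byte offset into a character offset."""
    return len(text.encode("utf-8")[:byte_offset].decode("utf-8"))


tree = ast.parse(source)
# Find the f-string node (ast.JoinedStr) that W1309 is reported on.
fstring = next(node for node in ast.walk(tree) if isinstance(node, ast.JoinedStr))

byte_end = fstring.end_col_offset                # offset counted in UTF-8 bytes
char_end = byte_to_char_offset(line, byte_end)   # offset counted in characters

print(len(line))                  # 20 characters, as the editor reports
print(len(line.encode("utf-8")))  # 22 bytes
print(fstring.col_offset, byte_end, char_end)
```

On a recent CPython the byte-based end offset should come out as 21, matching the `endColumn` in the report above, while the character-based offset is 19, which fits within the 20 characters Sonar expects.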

Since there's not much I can do about it here, I'm closing this one. Thanks for reporting it though!