Bad character count with unicode
PhML opened this issue · 1 comment
PhML commented
I don’t know at which level it fails: pylint, pylint-sonarjson or sonar.
Given the following configuration:
[MASTER]
load-plugins=pylint_sonarjson
disable=all
enable=W1309
[SONARQUBE]
output-format=sonarjson
and a file to check with unicode character ’:
print(f"it’s here!")
the output will be:
{
    "issues": [
        {
            "engineId": "PYLINT",
            "ruleId": "W1309",
            "type": "CODE_SMELL",
            "primaryLocation": {
                "message": "Using an f-string that does not have any interpolated variables",
                "filePath": "file.py",
                "textRange": {
                    "startLine": 1,
                    "startColumn": 6,
                    "endLine": 1,
                    "endColumn": 21
                }
            },
            "severity": "MINOR",
            "effortMinutes": 5
        }
    ]
}
Then, sonar-report will fail with this error:
java.lang.IllegalArgumentException: 21 is not a valid line offset for pointer. File file.py has 20 character(s) at line 1
Indeed, my editor tells me it has 20 characters…
teake commented
I can reproduce this. It looks like a bug in astroid (the library that PyLint uses to parse the source code). Astroid counts the ’ unicode character as three columns (which is easier to see if you add more ’s to the linted code in question). I've opened an issue upstream, pylint-dev/astroid#1744.
Since there's not much I can do about it here, I'm closing this one. Thanks for reporting it though!
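For anyone wanting to check the numbers above, here is a minimal sketch that uses Python's standard-library `ast` module rather than astroid itself. The `ast` documentation states that `col_offset` and `end_col_offset` are UTF-8 byte offsets, and ’ takes three bytes in UTF-8, which would be consistent with the "three columns" observation. Whether astroid simply passes these offsets through is an assumption here, not something confirmed in this thread. The helper `byte_to_char_column` is a hypothetical name introduced only for illustration; it is not part of pylint, astroid, or pylint-sonarjson.

```python
import ast

# The one-line file from the report, containing the ’ (U+2019) character.
SOURCE = 'print(f"it’s here!")\n'


def byte_to_char_column(line: str, byte_offset: int) -> int:
    """Convert a UTF-8 byte offset within `line` to a character offset.

    Hypothetical helper; assumes the offset falls on a character boundary.
    """
    return len(line.encode("utf-8")[:byte_offset].decode("utf-8"))


line = SOURCE.splitlines()[0]
print(len(line))                  # 20 characters, as the editor reports
print(len(line.encode("utf-8")))  # 22 bytes, because ’ is 3 bytes in UTF-8

# Locate the f-string node with the standard-library parser (not astroid).
fstring = next(
    node for node in ast.walk(ast.parse(SOURCE)) if isinstance(node, ast.JoinedStr)
)
print(fstring.col_offset, fstring.end_col_offset)  # byte-based columns

# Map the byte-based columns back to character-based columns.
print(
    byte_to_char_column(line, fstring.col_offset),
    byte_to_char_column(line, fstring.end_col_offset),
)
```

If the reported `endColumn` of 21 is such a byte offset, converting it this way yields a column that fits within the 20 characters Sonar expects on that line.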