omegacen/pylint-sonarjson

sonar-scanner import error when empty line

Turiok opened this issue · 4 comments

Hi @teake,

I'm working with the @usinelogicielle team.
We found an error when using your plugin in a specific case.
We have rules in our SonarQube that require specific comments on the first lines of a module.
If an issue is detected on an empty line, sonar-scanner reports an error because the start and end pointers are the same:

ERROR: Error during SonarScanner execution
java.lang.IllegalArgumentException: Start pointer [line=1, lineOffset=0] should be before end pointer [line=1, lineOffset=0]
    at org.sonar.api.utils.Preconditions.checkArgument(Preconditions.java:43)
    at org.sonar.api.batch.fs.internal.DefaultInputFile.newRangeValidPointers(DefaultInputFile.java:348)
    at org.sonar.api.batch.fs.internal.DefaultInputFile.newRange(DefaultInputFile.java:281)
    at org.sonar.scanner.externalissue.ExternalIssueImporter.fillLocation(ExternalIssueImporter.java:133)
    at org.sonar.scanner.externalissue.ExternalIssueImporter.importIssue(ExternalIssueImporter.java:81)
    at org.sonar.scanner.externalissue.ExternalIssueImporter.execute(ExternalIssueImporter.java:57)
    at org.sonar.scanner.externalissue.ExternalIssuesImportSensor.execute(ExternalIssuesImportSensor.java:74)
    at org.sonar.scanner.sensor.AbstractSensorWrapper.analyse(AbstractSensorWrapper.java:48)
    at org.sonar.scanner.sensor.ModuleSensorsExecutor.execute(ModuleSensorsExecutor.java:85)
    at org.sonar.scanner.sensor.ModuleSensorsExecutor.lambda$execute$1(ModuleSensorsExecutor.java:59)
    at org.sonar.scanner.sensor.ModuleSensorsExecutor.withModuleStrategy(ModuleSensorsExecutor.java:77)
    at org.sonar.scanner.sensor.ModuleSensorsExecutor.execute(ModuleSensorsExecutor.java:59)
    at org.sonar.scanner.scan.ModuleScanContainer.doAfterStart(ModuleScanContainer.java:82)
    at org.sonar.core.platform.ComponentContainer.startComponents(ComponentContainer.java:137)
    at org.sonar.core.platform.ComponentContainer.execute(ComponentContainer.java:123)
    at org.sonar.scanner.scan.ProjectScanContainer.scan(ProjectScanContainer.java:392)
    at org.sonar.scanner.scan.ProjectScanContainer.scanRecursively(ProjectScanContainer.java:388)
    at org.sonar.scanner.scan.ProjectScanContainer.doAfterStart(ProjectScanContainer.java:357)
    at org.sonar.core.platform.ComponentContainer.startComponents(ComponentContainer.java:137)
    at org.sonar.core.platform.ComponentContainer.execute(ComponentContainer.java:123)
    at org.sonar.scanner.bootstrap.GlobalContainer.doAfterStart(GlobalContainer.java:150)
    at org.sonar.core.platform.ComponentContainer.startComponents(ComponentContainer.java:137)
    at org.sonar.core.platform.ComponentContainer.execute(ComponentContainer.java:123)
    at org.sonar.batch.bootstrapper.Batch.doExecute(Batch.java:72)
    at org.sonar.batch.bootstrapper.Batch.execute(Batch.java:66)
    at org.sonarsource.scanner.api.internal.batch.BatchIsolatedLauncher.execute(BatchIsolatedLauncher.java:46)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:567)
    at org.sonarsource.scanner.api.internal.IsolatedLauncherProxy.invoke(IsolatedLauncherProxy.java:60)
    at com.sun.proxy.$Proxy0.execute(Unknown Source)
    at org.sonarsource.scanner.api.EmbeddedScanner.doExecute(EmbeddedScanner.java:189)
    at org.sonarsource.scanner.api.EmbeddedScanner.execute(EmbeddedScanner.java:138)
    at org.sonarsource.scanner.cli.Main.execute(Main.java:112)
    at org.sonarsource.scanner.cli.Main.execute(Main.java:75)
    at org.sonarsource.scanner.cli.Main.main(Main.java:61)

According to the SonarQube documentation (https://docs.sonarqube.org/8.9/analysis/generic-issue/),
the startColumn field is optional, so on an empty line it shouldn't be filled in.

I tested omitting it and the import works. I'll propose a solution.

teake commented

Hang on, when the end column is zero it doesn't get added to the JSON output:

if hasattr(msg, "end_column") and msg.end_column:
    sonar_dict["primaryLocation"]["textRange"]["endColumn"] = msg.end_column

If msg.end_column is 0, the conditional on line 56 does not evaluate to true (which is not correct in other circumstances; that logic should probably be changed).
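To illustrate the point about the conditional: a truthiness test treats 0 the same as a missing value, whereas an explicit `is not None` test keeps a zero column. The sketch below is not the plugin's code. `text_range` is a hypothetical helper, and a `SimpleNamespace` stands in for a pylint message. Whether a zero endColumn should actually be emitted is a separate question.

```python
from types import SimpleNamespace


def text_range(msg, strict=True):
    """Build a textRange dict from a message-like object (hypothetical helper)."""
    rng = {"startLine": msg.line}
    end_column = getattr(msg, "end_column", None)
    if strict:
        # Explicit None test: a value of 0 is kept.
        keep = end_column is not None
    else:
        # Truthiness test, as in the quoted snippet: 0 is dropped.
        keep = bool(end_column)
    if keep:
        rng["endColumn"] = end_column
    return rng


msg = SimpleNamespace(line=1, end_column=0)
print(text_range(msg, strict=False))  # {'startLine': 1}
print(text_range(msg, strict=True))   # {'startLine': 1, 'endColumn': 0}
```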

So I'm a little puzzled how SonarQube can spit out the above error on empty lines. Can you provide the JSON output of pylint-sonarjson and the code on which it is generated?

Sorry for the late answer.

Here is the offending code:


"""
Classe Database Interrogation
"""

import ast

tree = ast.parse("print('à')")
for node in ast.walk(tree):
    print(node)
    print(node.__dict__)
    print("children: " + str([x for x in ast.iter_child_nodes(node)]) + "\\n")

Here is the configuration file:

[MASTER]
# Specify a configuration file.
#rcfile=

# Python code to execute, usually for sys.path manipulation such as
# pygtk.require().
#init-hook=

# Profiled execution.
profile=no

# Add files or directories to the blacklist. They should be base names, not
# paths.
ignore=CVS

# Pickle collected data for later comparisons.
persistent=yes

# List of plugins (as comma separated values of python modules names) to load, 
# usually to register additional checkers.
load-plugins=cnes_checker,pylint.extensions.check_elif,pylint_sonarjson


[MESSAGES CONTROL]


# Enable the message, report, category or checker with the given id(s). You can
# either give multiple identifier separated by comma (,) or put this option
# multiple time. See also the "--disable" option for examples.
# 2015/07/09 : C0103 (invalid-name/PY.NAME.Convention), R0801 (duplicate-code/COM.PROJECT.CodeCloning) 
#              and R0915 (too-many-statements/COM.MET.LineOfCode) rules enabled
#              In addition, C0102 and W0141 have to be enabled for bad-names and bad-functions rules
disable=all
enable=R5104,W0703,R5402,R5401,R5105,R0203,R0204,R5403,R5106,R0401,W0102,W9097,E0108,R5103,F0002,W9096,C0204,C0203,E0213,C0122,W9095,W0312,W0406,R5101,W0622,W0621,W0404,W0403,R5201,R5301,R5302,R5102,R0915,W0602,E0602,C0412,W0612,C0113,W0603,E0601,C0411,C0413,C0326

max-bool-expr=5
max-returns=1
max-nested-blocks=5

# Disable the message, report, category or checker with the given id(s). You
# can either give multiple identifiers separated by comma (,) or put this
# option multiple times (only on the command line, not in the configuration
# file where it should appear only once).You can also use "--disable=all" to
# disable everything first and then reenable specific checks. For example, if
# you want to run only the similarities checker, you can use "--disable=all
# --enable=similarities". If you want to run only the classes checker, but have
# no Warning level messages displayed, use"--disable=all --enable=classes
# --disable=W"


[REPORTS]

# Set the output format. Available formats are text, parseable, colorized, msvs
# (visual studio) and html. You can also give a reporter class, eg
# mypackage.mymodule.MyReporterClass.
output-format=text

# Put messages in a separate file for each module / package specified on the
# command line instead of printing them on stdout. Reports (if any) will be
# written in a file name "pylint_global.[txt|html]".
files-output=no

# Tells whether to display a full report or only the messages
reports=yes

# Python expression which should return a note less than 10 (10 is the highest
# note). You have access to the variables errors warning, statement which
# respectively contain the number of errors / warnings messages and the total
# number of statements analyzed. This is used by the global evaluation report
# (RP0004).
evaluation=10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10)

# Add a comment according to your evaluation note. This is used by the global
# evaluation report (RP0004).
comment=no

# Template used to display messages. This is a python new-style format string
# used to format the message information. See doc for all details
#msg-template=


[BASIC]

# Required attributes for module, separated by a comma
required-attributes=

# List of builtins function names that should not be used, separated by a comma
bad-functions=map,filter,apply,input

# Regular expression which should only match correct module names
module-rgx=(([a-z_][a-z0-9_]*)|([A-Z][a-zA-Z0-9]+))$

# Regular expression which should only match correct module level names
const-rgx=(([A-Z_][A-Z0-9_]*)|(__.*__))$

# Regular expression which should only match correct class names
class-rgx=[A-Z_][a-zA-Z0-9]+$

# Regular expression which should only match correct function names
function-rgx=[a-z_][a-z0-9_]{2,30}$

# Regular expression which should only match correct method names
method-rgx=[a-z_][a-z0-9_]{2,30}$

# Regular expression which should only match correct instance attribute names
attr-rgx=[a-z_][a-z0-9_]{2,30}$

# Regular expression which should only match correct argument names
argument-rgx=[a-z_][a-z0-9_]{2,30}$

# Regular expression which should only match correct variable names
variable-rgx=[a-z_][a-z0-9_]{2,30}$

# Regular expression which should only match correct attribute names in class
# bodies
class-attribute-rgx=([A-Za-z_][A-Za-z0-9_]{2,30}|(__.*__))$

# Regular expression which should only match correct list comprehension /
# generator expression variable names
inlinevar-rgx=[A-Za-z_][A-Za-z0-9_]*$

# Good variable names which should always be accepted, separated by a comma
good-names=i,j,k,ex,Run,_

# Bad variable names which should always be refused, separated by a comma
bad-names=foo,bar,baz,toto,tutu,tata

# Regular expression which should only match function or class names that do
# not require a docstring.
no-docstring-rgx=__.*__

# Minimum line length for functions/classes that require docstrings, shorter
# ones are exempt.
docstring-min-length=-1


[FORMAT]

# Maximum number of characters on a single line.
max-line-length=100

# Regexp for a line that is allowed to be longer than the limit.
ignore-long-lines=^\s*(# )?<?https?://\S+>?$

# Allow the body of an if to be on the same line as the test if there is no
# else.
single-line-if-stmt=no

# List of optional constructs for which whitespace checking is disabled
no-space-check=trailing-comma,dict-separator

# Maximum number of lines in a module
max-module-lines=1000

# String used as indentation unit. This is usually " " (4 spaces) or "\t" (1
# tab).
indent-string='    '


[SIMILARITIES]

# Minimum lines number of a similarity.
min-similarity-lines=4

# Ignore comments when computing similarities.
ignore-comments=yes

# Ignore docstrings when computing similarities.
ignore-docstrings=yes

# Ignore imports when computing similarities.
ignore-imports=no


[VARIABLES]

# Tells whether we should check for unused import in __init__ files.
init-import=no

# A regular expression matching the beginning of the name of dummy variables
# (i.e. not used).
dummy-variables-rgx=_$|dummy

# List of additional names supposed to be defined in builtins. Remember that
# you should avoid to define new builtins when possible.
additional-builtins=


[TYPECHECK]

# Tells whether missing members accessed in mixin class should be ignored. A
# mixin class is detected if its name ends with "mixin" (case insensitive).
ignore-mixin-members=yes

# List of classes names for which member attributes should not be checked
# (useful for classes with attributes dynamically set).
ignored-classes=SQLObject

# When zope mode is activated, add a predefined set of Zope acquired attributes
# to generated-members.
zope=no

# List of members which are set dynamically and missed by pylint inference
# system, and so shouldn't trigger E0201 when accessed. Python regular
# expressions are accepted.
generated-members=REQUEST,acl_users,aq_parent


[MISCELLANEOUS]

# List of note tags to take in consideration, separated by a comma.
notes=FIXME,XXX,TODO


[CLASSES]

# List of interface methods to ignore, separated by a comma. This is used for
# instance to not check methods defines in Zope's Interface base class.
ignore-iface-methods=isImplementedBy,deferred,extends,names,namesAndDescriptions,queryDescriptionFor,getBases,getDescriptionFor,getDoc,getName,getTaggedValue,getTaggedValueTags,isEqualOrExtendedBy,setTaggedValue,isImplementedByInstancesOf,adaptWith,is_implemented_by

# List of method names used to declare (i.e. assign) instance attributes.
defining-attr-methods=__init__,__new__,setUp

# List of valid names for the first argument in a class method.
valid-classmethod-first-arg=cls

# List of valid names for the first argument in a metaclass class method.
valid-metaclass-classmethod-first-arg=mcs


[DESIGN]

# Maximum number of arguments for function / method
max-args=5

# Argument names that match this expression will be ignored. Default to name
# with leading underscore
ignored-argument-names=_.*

# Maximum number of locals for function / method body
max-locals=15

# Maximum number of return / yield for function / method body
max-returns=1

# Maximum number of branch for function / method body
max-branches=12

# Maximum number of statements in function / method body
max-statements=80

# Add for RNC CNES
accept-no-param-doc=n
max-mccabe-number=15
max-simplified-mccabe-number=12
max-nested-blocks=6
min-func-comments-ratio=20
min-module-comments-ratio=20
min-func-size-to-check-comments=10
max-bool-expr=5
max-decorators=5

# Maximum number of parents for a class (see R0901).
max-parents=7

# Maximum number of attributes for a class (see R0902).
max-attributes=7

# Minimum number of public methods for a class (see R0903).
min-public-methods=2

# Maximum number of public methods for a class (see R0904).
max-public-methods=20


[IMPORTS]

# Deprecated modules which should not be used, separated by a comma
deprecated-modules=regsub,TERMIOS,Bastion,rexec

# Create a graph of every (i.e. internal and external) dependencies in the
# given file (report RP0402 must not be disabled)
import-graph=

# Create a graph of external dependencies in the given file (report RP0402 must
# not be disabled)
ext-import-graph=

# Create a graph of internal dependencies in the given file (report RP0402 must
# not be disabled)
int-import-graph=


[EXCEPTIONS]

# Exceptions that will emit a warning when being caught. Defaults to
# "Exception"
overgeneral-exceptions=Exception


[SONARQUBE]

# Define SonarQube rules to enable importing Pylint results with customized
# severity and type. Option sonar.externalIssuesReportPaths must be used with
# sonar-scanner. To generate the json file, package pylint-sonarjson must be
# installed.
# For more information see: https://github.com/omegacen/pylint-sonarjson
sonar-rules=C0113:MINOR:10:CODE_SMELL,C0122:MINOR:10:CODE_SMELL,C0203:MINOR:10:CODE_SMELL,C0204:MINOR:10:CODE_SMELL,C0326:MINOR:10:CODE_SMELL,C0411:MINOR:10:CODE_SMELL,C0412:MINOR:10:CODE_SMELL,C0413:MINOR:10:CODE_SMELL,E0108:MINOR:10:CODE_SMELL,E0213:MAJOR:10:CODE_SMELL,E0601:MAJOR:10:CODE_SMELL,E0602:MAJOR:10:CODE_SMELL,F0002:MAJOR:10:CODE_SMELL,R0203:MINOR:10:CODE_SMELL,R0204:CRITICAL:10:CODE_SMELL,R0401:MINOR:10:CODE_SMELL,R0915:MINOR:10:CODE_SMELL,R5101:CRITICAL:10:CODE_SMELL,R5102:CRITICAL:10:CODE_SMELL,R5103:CRITICAL:10:CODE_SMELL,R5104:CRITICAL:10:CODE_SMELL,R5105:CRITICAL:10:CODE_SMELL,R5106:CRITICAL:10:CODE_SMELL,R5201:CRITICAL:10:CODE_SMELL,R5301:CRITICAL:10:CODE_SMELL,R5302:CRITICAL:10:CODE_SMELL,R5401:BLOCKER:10:CODE_SMELL,R5402:BLOCKER:10:CODE_SMELL,R5403:BLOCKER:10:CODE_SMELL,W0102:MINOR:10:CODE_SMELL,W0312:MAJOR:10:CODE_SMELL,W0403:MINOR:10:CODE_SMELL,W0404:MINOR:10:CODE_SMELL,W0406:MINOR:10:CODE_SMELL,W0602:MINOR:10:CODE_SMELL,W0603:MINOR:10:CODE_SMELL,W0612:MINOR:10:CODE_SMELL,W0621:MINOR:10:CODE_SMELL,W0622:MINOR:10:CODE_SMELL,W0703:MINOR:10:CODE_SMELL,W9095:CRITICAL:10:CODE_SMELL,W9096:CRITICAL:10:CODE_SMELL,W9097:CRITICAL:10:CODE_SMELL
output-format=sonarjson

Here are the commands:

pip install cnes-pylint-extension 
pip install pylint-sonarjson==1.0.5
pylint ast_example.py --rcfile=pylintrc_RNC2015_C --exit-zero --halt-on-invalid-sonar-rules n

Here is the output:

{
    "issues": [
        {
            "engineId": "PYLINT",
            "ruleId": "W9095",
            "type": "CODE_SMELL",
            "primaryLocation": {
                "message": "\"author\" field missing from ast_example docstring",
                "filePath": "ast_example.py",
                "textRange": {
                    "startLine": 1,
                    "startColumn": 0
                }
            },
            "severity": "CRITICAL",
            "effortMinutes": 10
        },
        {
            "engineId": "PYLINT",
            "ruleId": "W9095",
            "type": "CODE_SMELL",
            "primaryLocation": {
                "message": "\"version\" field missing from ast_example docstring",
                "filePath": "ast_example.py",
                "textRange": {
                    "startLine": 1,
                    "startColumn": 0
                }
            },
            "severity": "CRITICAL",
            "effortMinutes": 10
        },
        {
            "engineId": "PYLINT",
            "ruleId": "W9095",
            "type": "CODE_SMELL",
            "primaryLocation": {
                "message": "\"date\" field missing from ast_example docstring",
                "filePath": "ast_example.py",
                "textRange": {
                    "startLine": 1,
                    "startColumn": 0
                }
            },
            "severity": "CRITICAL",
            "effortMinutes": 10
        }
    ]
}

All three issues start at line 1, but line 1 is empty, so sonar-scanner crashes when it looks up the code on that line.

teake commented

Thanks for the code, I can reproduce your JSON output.

To be honest, this looks like an issue in SonarQube to me. If you omit the endColumn but provide the startColumn, SonarQube will set the end column equal to the last column of the line (ExternalIssueImporter.java#L126). But then later on SonarQube will check for a non-zero length text interval (DefaultInputFile.java#L365), which in your case is zero, and then it raises the "Start pointer should be before end pointer" error.

This does not happen when the startColumn is omitted, because SonarQube will then not set the endColumn equal to the last column of the line (ExternalIssueImporter.java#L121).
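The behaviour described in the two paragraphs above can be modelled in a few lines. This is a Python paraphrase of that description only, not SonarQube's Java code. The function name `import_range` and its return values are made up for illustration.

```python
def import_range(line_text, start_column=None, end_column=None):
    """Model of the import logic as described above (not SonarQube code)."""
    if start_column is None:
        # No start column: the whole line is selected and no
        # start/end pointer comparison takes place.
        return ("line", None, None)
    if end_column is None:
        # End column defaults to the last column of the line.
        end_column = len(line_text)
    if not start_column < end_column:
        # Zero-length interval: this is the failure from the stack trace.
        raise ValueError(
            f"Start pointer [lineOffset={start_column}] should be before "
            f"end pointer [lineOffset={end_column}]"
        )
    return ("range", start_column, end_column)


print(import_range("import ast", start_column=0))  # ('range', 0, 10)
print(import_range(""))                            # ('line', None, None)
try:
    import_range("", start_column=0)               # empty line, startColumn given
except ValueError as exc:
    print("rejected:", exc)
```

On an empty line the defaulted end column is 0, so a startColumn of 0 can never be strictly before it. That is why only the combination "startColumn present, line empty" fails.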

There's no way to fix this properly in the pylint plugin. Simply omitting the startColumn when it is equal to 0, as in #8, is a workaround for when the line is empty. But when the line is not empty, this workaround is not needed and is even harmful (as it throws away relevant information). And there's no way of knowing whether the line is empty in the pylint plugin. That information is simply not available there without resorting to parsing the file, which is out of scope for the plugin.
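Since the plugin itself cannot see the file contents, one option on the user side is a separate post-processing step that reads the generated report and the source files, and drops startColumn only where the referenced line really is empty. The following is a minimal sketch under that assumption. The function name and the `root` parameter are hypothetical, and this is neither part of pylint-sonarjson nor a fix endorsed by SonarQube.

```python
from pathlib import Path


def drop_start_column_on_empty_lines(report, root="."):
    """Remove startColumn from issues whose startLine is an empty line.

    `report` is the parsed JSON report (a dict with an "issues" list);
    `root` is the directory that the filePath entries are relative to.
    """
    cache = {}
    for issue in report.get("issues", []):
        location = issue["primaryLocation"]
        text_range = location.get("textRange", {})
        if "startLine" not in text_range or "startColumn" not in text_range:
            continue
        path = Path(root) / location["filePath"]
        if path not in cache:
            # Read each source file once and keep its lines.
            cache[path] = path.read_text(encoding="utf-8").splitlines()
        lines = cache[path]
        index = text_range["startLine"] - 1  # startLine is 1-based
        if 0 <= index < len(lines) and lines[index] == "":
            # Empty line: keep only the line number.
            del text_range["startColumn"]
            text_range.pop("endColumn", None)
    return report
```

A caller would load the report with `json.load`, pass it through this function, and write it back with `json.dump` before running sonar-scanner. Issues on non-empty lines keep their column information.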

So I'm afraid I'll have to ask you to report this upstream to SonarQube. If they respond with something like "you have to always provide startColumn and endColumn together" then I'm happy to make the corresponding changes here. If they respond by improving their code, even better.

I have some final questions for you though: why does the module in your example start with an empty line? And why does cnes-pylint-extension report an issue with the opening empty line and not with the first line of the docstring, line 2? The latter looks like a bug in cnes-pylint-extension to me.

Hi @teake ,

We created a post on the SonarQube forum.
Hopefully a fix will be made: https://community.sonarsource.com/t/error-on-first-empty-lines/103509

You can close the issue.