R1707 (trailing-comma-tuple) checks perform excessive is_message_enabled calls
correctmost opened this issue · 3 comments
Bug description
RefactoringChecker::process_tokens
has a loop with an embedded self.linter.is_message_enabled("trailing-comma-tuple")
call:
pylint/pylint/checkers/refactoring/refactoring_checker.py
Lines 652 to 665 in 7521eb1
This call gets executed ~1.4 million times when running pylint on the yt-dlp repo. Hoisting the is_message_enabled
call outside of the loop can bring the number of executions down to ~1100.
Configuration
[MAIN]
jobs=1
[MESSAGES CONTROL]
disable=all
enable=R1707
[REPORTS]
reports=no
score=no
Command used
Steps to reproduce
git clone https://github.com/yt-dlp/yt-dlp.git
cd yt-dlp
git checkout 5904853ae5788509fdc4892cb7ecdfa9ae7f78e6
cat << EOF > ./profile_pylint.py
import cProfile
import pstats
import sys
sys.argv = ['pylint', '--recursive=y', '.']
cProfile.run('from pylint import __main__', filename='stats')
with open('profiler_stats', 'w', encoding='utf-8') as file:
stats = pstats.Stats('stats', stream=file)
stats.sort_stats('tottime')
stats.print_stats()
EOF
cat << EOF > .pylintrc
[MAIN]
jobs=1
[MESSAGES CONTROL]
disable=all
enable=R1707
[REPORTS]
reports=no
score=no
EOF
python ./profile_pylint.py
Analysis
process_tokens
calls is_message_enabled
~1.4 million times:
import pstats
stats = pstats.Stats('stats')
stats.print_callers('is_message_enabled')
ncalls tottime cumtime
1448094 1.515 3.509 pylint/checkers/refactoring/refactoring_checker.py:650(process_tokens)
Pylint output
There are some R1707 errors, but the output is less important than the performance numbers.
Expected behavior
Improved performance via reduced is_message_enabled
calls
Pylint version
astroid @ git+https://github.com/pylint-dev/astroid.git@0ccc2e29d4b9ce0f54f9e50fa6df85522083c5de
pylint @ git+https://github.com/pylint-dev/pylint.git@7521eb1dc6ac89fcf1763bee879d1207a87ddefa
Python 3.12.3
OS / Environment
Arch Linux
Additional dependencies
No response
Wow, this one is shocking, should have been a low hanging fruits during review. Thanks again, this is very valuable.
And in a performance MR of all thing 😄 #8606
Wait, no, we kinda need to do that on each line (not each token though, and we don't need to recheck everything). The enabling / disabling can be per line.