Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

R1707 (trailing-comma-tuple) checks perform excessive is_message_enabled calls #9608

Closed
correctmost opened this issue May 7, 2024 · 3 comments · Fixed by #9609 or #9620
Closed

R1707 (trailing-comma-tuple) checks perform excessive is_message_enabled calls #9608

correctmost opened this issue May 7, 2024 · 3 comments · Fixed by #9609 or #9620
Assignees
Labels
Enhancement ✨ Improvement to a component Needs PR This issue is accepted, sufficiently specified and now needs an implementation performance
Milestone

Comments

@correctmost
Copy link
Contributor

correctmost commented May 7, 2024

Bug description

RefactoringChecker::process_tokens has a loop with an embedded self.linter.is_message_enabled("trailing-comma-tuple") call:

for index, token in enumerate(tokens):
token_string = token[1]
if token_string == "elif":
# AST exists by the time process_tokens is called, so
# it's safe to assume tokens[index+1] exists.
# tokens[index+1][2] is the elif's position as
# reported by CPython and PyPy,
# token[2] is the actual position and also is
# reported by IronPython.
self._elifs.extend([token[2], tokens[index + 1][2]])
elif self.linter.is_message_enabled(
"trailing-comma-tuple"
) and _is_trailing_comma(tokens, index):
self.add_message("trailing-comma-tuple", line=token.start[0])

This call gets executed ~1.4 million times when running pylint on the yt-dlp repo. Hoisting the is_message_enabled call outside of the loop can bring the number of executions down to ~1100.

Configuration

[MAIN]
jobs=1

[MESSAGES CONTROL]
disable=all
enable=R1707

[REPORTS]
reports=no
score=no

Command used

Steps to reproduce

git clone https://github.com/yt-dlp/yt-dlp.git
cd yt-dlp
git checkout 5904853ae5788509fdc4892cb7ecdfa9ae7f78e6

cat << EOF > ./profile_pylint.py
import cProfile
import pstats
import sys

sys.argv = ['pylint', '--recursive=y', '.']
cProfile.run('from pylint import __main__', filename='stats')

with open('profiler_stats', 'w', encoding='utf-8') as file:
    stats = pstats.Stats('stats', stream=file)
    stats.sort_stats('tottime')
    stats.print_stats()
EOF

cat << EOF > .pylintrc
[MAIN]
jobs=1

[MESSAGES CONTROL]
disable=all
enable=R1707

[REPORTS]
reports=no
score=no
EOF

python ./profile_pylint.py

Analysis

process_tokens calls is_message_enabled ~1.4 million times:

import pstats

stats = pstats.Stats('stats')
stats.print_callers('is_message_enabled')

 ncalls  tottime  cumtime
1448094    1.515    3.509  pylint/checkers/refactoring/refactoring_checker.py:650(process_tokens)

Pylint output

There are some R1707 errors, but the output is less important than the performance numbers.

Expected behavior

Improved performance via reduced is_message_enabled calls

Pylint version

astroid @ git+https://github.com/pylint-dev/astroid.git@0ccc2e29d4b9ce0f54f9e50fa6df85522083c5de
pylint @ git+https://github.com/pylint-dev/pylint.git@7521eb1dc6ac89fcf1763bee879d1207a87ddefa
Python 3.12.3

OS / Environment

Arch Linux

Additional dependencies

No response

@correctmost correctmost added the Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling label May 7, 2024
@Pierre-Sassoulas Pierre-Sassoulas added Enhancement ✨ Improvement to a component performance Needs PR This issue is accepted, sufficiently specified and now needs an implementation and removed Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels May 7, 2024
@Pierre-Sassoulas Pierre-Sassoulas self-assigned this May 7, 2024
@Pierre-Sassoulas Pierre-Sassoulas added this to the 3.1.1 milestone May 7, 2024
@Pierre-Sassoulas
Copy link
Member

Wow, this one is shocking, should have been a low hanging fruits during review. Thanks again, this is very valuable.

@Pierre-Sassoulas
Copy link
Member

And in a performance MR of all thing 😄 #8606

Pierre-Sassoulas added a commit to Pierre-Sassoulas/pylint that referenced this issue May 7, 2024
@Pierre-Sassoulas
Copy link
Member

Wait, no, we kinda need to do that on each line (not each token though, and we don't need to recheck everything). The enabling / disabling can be per line.

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 3.1.1, 3.2.0 May 7, 2024
Pierre-Sassoulas added a commit to Pierre-Sassoulas/pylint that referenced this issue May 8, 2024
Pierre-Sassoulas added a commit to Pierre-Sassoulas/pylint that referenced this issue May 8, 2024
@jacobtylerwalls jacobtylerwalls modified the milestones: 3.2.0, 3.3.0 May 12, 2024
Pierre-Sassoulas added a commit to Pierre-Sassoulas/pylint that referenced this issue May 13, 2024
Pierre-Sassoulas added a commit to Pierre-Sassoulas/pylint that referenced this issue May 13, 2024
Pierre-Sassoulas added a commit to Pierre-Sassoulas/pylint that referenced this issue May 13, 2024
Performance analysis by correctmost.  Follow up to fix the known limitation will be in pylint-dev#9609

Refs pylint-dev#8606 (follow-up)
Closes pylint-dev#9608
Pierre-Sassoulas added a commit to Pierre-Sassoulas/pylint that referenced this issue May 13, 2024
Performance analysis by correctmost.  Follow up to fix the known limitation will be in pylint-dev#9609

Refs pylint-dev#8606 (follow-up)
Closes pylint-dev#9608
@jacobtylerwalls jacobtylerwalls modified the milestones: 3.3.0, 3.2.0, 3.2.1 May 18, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Enhancement ✨ Improvement to a component Needs PR This issue is accepted, sufficiently specified and now needs an implementation performance
Projects
None yet
3 participants