Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Make minor effeciency changes to regex usages #1133

Closed
wants to merge 1 commit into from

Conversation

sirosen
Copy link

@sirosen sirosen commented Jan 12, 2025

  • For static patterns which are used repeatedly, compile them in module scope
  • For identifier char checking, simply use a string collection membership check

These changes make jedi-vim faster by a small degree every time completions, rename, and show_call_signatures are invoked.


Ideally I'd have been able to give benchmarks to prove it's faster, but I haven't been able to get vspec working locally and therefore haven't gotten far on a local testing setup.
If these changes are unwanted, please don't hesitate to tell me. I'm happy to focus my attention elsewhere (though my vimscript is pretty weak).

- For static patterns which are used repeatedly, compile them in
  module scope
- For identifier char checking, simply use a string collection
  membership check

These changes make jedi-vim faster by a small degree every time
completions, rename, and show_call_signatures are invoked.
@davidhalter
Copy link
Owner

I'm pretty sure that string.letters + string.digits + "_" does not match the behavior of \w (besides the fact that the tests are failing).

I guess I am going to close for two reasons:

  1. This won't have a significant impact on performance, since Jedi itself will always be slower by factors of magnitude.
  2. While the change itself is fine in general, jedi-vim is unfortunately not well tested at all. So with these changes there is sometimes the risk that we overlook something that might cause bugs. Since jedi-vim is in maintenance mode and not used a lot anymore (most people use LSP these days for good reasons, even when using Jedi), it's IMO just an unnecessary risk.
  3. I have limited time. Testing this and other changes well is usually not worth it if it doesn't have a significant upside.

So I still want to thank you for trying, I don't think that this is a bad change at all and we probably should have done it like that in the beginning, but it's just not worth the hassle.

@sirosen
Copy link
Author

sirosen commented Jan 13, 2025

100% understood! I think that's a more than fair response.

I was trying to get my hands dirty with some valid improvement before trying to help in other ways, not put an extra burden on you. I'll keep an eye out for ways I can help where "the juice is worth the squeeze", so to speak.

Thanks (as always) for the awesome tools!


EDIT: Oh, and I guess I should also say that I'm pretty sure that the string usage matches \w, which is documented as [a-zA-Z0-9_], but I wrote letters, which is py2-only, when I meant ascii_letters! 🤦

@davidhalter
Copy link
Owner

\w is documented like

For Unicode (str) patterns:

    Matches Unicode word characters; this includes all Unicode alphanumeric characters (as defined by str.isalnum()), as well as the underscore (_).

    Matches [a-zA-Z0-9_] if the ASCII flag is used.

Which in this case includes a big amount of unicode letters.

@sirosen
Copy link
Author

sirosen commented Jan 14, 2025

You're 100% right. I was being clumsy in my thinking and assumed that identifiers are ascii-only. But they are not.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants