Make minor effeciency changes to regex usages #1133

sirosen · 2025-01-12T07:16:11Z

For static patterns which are used repeatedly, compile them in module scope
For identifier char checking, simply use a string collection membership check

These changes make jedi-vim faster by a small degree every time completions, rename, and show_call_signatures are invoked.

Ideally I'd have been able to give benchmarks to prove it's faster, but I haven't been able to get vspec working locally and therefore haven't gotten far on a local testing setup.
If these changes are unwanted, please don't hesitate to tell me. I'm happy to focus my attention elsewhere (though my vimscript is pretty weak).

- For static patterns which are used repeatedly, compile them in module scope - For identifier char checking, simply use a string collection membership check These changes make jedi-vim faster by a small degree every time completions, rename, and show_call_signatures are invoked.

davidhalter · 2025-01-13T19:45:09Z

I'm pretty sure that string.letters + string.digits + "_" does not match the behavior of \w (besides the fact that the tests are failing).

I guess I am going to close for two reasons:

This won't have a significant impact on performance, since Jedi itself will always be slower by factors of magnitude.
While the change itself is fine in general, jedi-vim is unfortunately not well tested at all. So with these changes there is sometimes the risk that we overlook something that might cause bugs. Since jedi-vim is in maintenance mode and not used a lot anymore (most people use LSP these days for good reasons, even when using Jedi), it's IMO just an unnecessary risk.
I have limited time. Testing this and other changes well is usually not worth it if it doesn't have a significant upside.

So I still want to thank you for trying, I don't think that this is a bad change at all and we probably should have done it like that in the beginning, but it's just not worth the hassle.

sirosen · 2025-01-13T20:12:46Z

100% understood! I think that's a more than fair response.

I was trying to get my hands dirty with some valid improvement before trying to help in other ways, not put an extra burden on you. I'll keep an eye out for ways I can help where "the juice is worth the squeeze", so to speak.

Thanks (as always) for the awesome tools!

EDIT: Oh, and I guess I should also say that I'm pretty sure that the string usage matches \w, which is documented as [a-zA-Z0-9_], but I wrote letters, which is py2-only, when I meant ascii_letters! 🤦

davidhalter · 2025-01-14T15:12:19Z

\w is documented like

For Unicode (str) patterns:

    Matches Unicode word characters; this includes all Unicode alphanumeric characters (as defined by str.isalnum()), as well as the underscore (_).

    Matches [a-zA-Z0-9_] if the ASCII flag is used.

Which in this case includes a big amount of unicode letters.

sirosen · 2025-01-14T15:34:59Z

You're 100% right. I was being clumsy in my thinking and assumed that identifiers are ascii-only. But they are not.

davidhalter closed this Jan 13, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make minor effeciency changes to regex usages #1133

Make minor effeciency changes to regex usages #1133

sirosen commented Jan 12, 2025

davidhalter commented Jan 13, 2025

sirosen commented Jan 13, 2025 •

edited

Loading

davidhalter commented Jan 14, 2025

sirosen commented Jan 14, 2025

Make minor effeciency changes to regex usages #1133

Make minor effeciency changes to regex usages #1133

Conversation

sirosen commented Jan 12, 2025

davidhalter commented Jan 13, 2025

sirosen commented Jan 13, 2025 • edited Loading

davidhalter commented Jan 14, 2025

sirosen commented Jan 14, 2025

sirosen commented Jan 13, 2025 •

edited

Loading