Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

each_recursive is extremely slow #134

Closed
makenowjust opened this issue May 31, 2024 · 2 comments · Fixed by #139
Closed

each_recursive is extremely slow #134

makenowjust opened this issue May 31, 2024 · 2 comments · Fixed by #139

Comments

@makenowjust
Copy link
Contributor

REXML::Node#each_recursive is a fundamental operation to traverse XML nodes. In particular, CSS selector uses this method heavily.

Unfortunately, this method is extremely slow. The following Gist contains a tiny (but mostly equivalent) each_recursive implementation and their benchmarks.

https://gist.github.com/makenowjust/b4852a59e53f0c85c740818c75303d5e

And, the below is a result of this script on my laptop (Apple M1 Pro, 14 inch, 32 GB) and Ruby 3.3.2.

$ ruby bench.rb
       user     system      total        real
   each_recursive  2.421999   0.008742   2.430741 (  2.453438)
my_each_recursive  0.057647   0.000049   0.057696 (  0.058718)
$ ruby --yjit bench.rb
       user     system      total        real
   each_recursive  1.613353   0.008227   1.621580 (  1.625374)
my_each_recursive  0.025493   0.000124   0.025617 (  0.025748)

Yes, REXML's each_recursive is ~50x (or ~80x with YJIT) slower than my tiny implementation.

I believe REXML does a lot of extra work, and we can make it faster.

@kou
Copy link
Member

kou commented May 31, 2024

Let's improve performance of it:

  1. Add a benchmark to https://github.com/ruby/rexml/tree/master/benchmark
  2. Improve performance step by step like we do for parser performance

BTW, are you interesting in merging the CSS selector feature to REXML itself?

@makenowjust
Copy link
Contributor Author

@kou Thank you. I will try to improve its performance.

I am also interested in merging the CSS selector feature to REXML.
But, there are some problems and concerns:

  • rexml-css_selector implementation uses the latest Ruby features. We need to downgrade the source (or use a transpiler much like ruby-next.)
  • This implementation is generic. Actually, it has a Prism adapter.
  • Implementation is still incomplete, and it will be changed with high frequency. Therefore, I want to place it where I can get to change it quickly.

I will create a separate issue on this matter.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants