Question: PyO3 wrapped snappy, slower than python-snappy #34
Comments
Thanks for reaching out. I'm always happy to take a look at performance concerns. I peeked at your binding code.

As far as your benchmarks go, while they do appear to be comparing apples to apples, they are deeply biased. Not only are you benchmarking a single data set, but you're benchmarking a pathological one: a file consisting of the same byte repeated. It's entirely plausible that the reference implementation is better optimized for that case while not being faster in all cases.

You can actually run benchmarks directly against both the C++ reference implementation and the Rust implementation. Moreover, those benchmarks consist of many different types of data commonly found in the wild. You can run them from this crate's benchmark suite.
And to view a nice comparative output, you'll want to compare the results for the two implementations side by side.
As you can see, the reference implementation has the edge in a number of cases, but not all and not by much. The reference implementation has likely grown some optimizations over the years since I wrote this implementation in Rust. Anyone is welcome to find them and port them to Rust code. I don't think I'll be doing it myself any time soon because, as far as I can tell, I'd still call this library "on par" with the C++ implementation.

For your case, I'd recommend your next step to be making your benchmark less biased. The best route would probably be to reproduce the benchmarks used by this crate, and then run them on your end. If you see vastly different results comparatively, then perhaps there is something wrong in the binding code somewhere. But if you see results comparatively consistent with the above, then the mystery is solved: the reference implementation just happens to be quite a bit better on one pathological case.
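As a rough illustration of the advice above (this sketch is not from the original thread), the following Python snippet times compression over several qualitatively different inputs rather than a single repeated-byte file. It only calls python-snappy's documented `snappy.compress`; the wrapped Rust function can be added to the `COMPRESSORS` dictionary, but its exact cramjam name is not shown in this thread, so it is left as a placeholder comment.

```python
import os
import timeit

import snappy  # python-snappy; exposes snappy.compress(data: bytes) -> bytes

# A few qualitatively different inputs. A single repeated byte is the
# pathological case discussed above; random bytes and text-like data
# exercise very different code paths in the compressor.
PAYLOADS = {
    "repeated": b"a" * 1_000_000,
    "random": os.urandom(1_000_000),
    "text-like": (b"the quick brown fox jumps over the lazy dog\n" * 25_000)[:1_000_000],
}

# Map of name -> compress callable. Add the wrapped Rust function here, e.g.
# COMPRESSORS["cramjam"] = <your wrapper's compress function>
# (left as a placeholder because the exact cramjam API is not shown in this thread).
COMPRESSORS = {
    "python-snappy": snappy.compress,
}

for data_name, data in PAYLOADS.items():
    for lib_name, compress in COMPRESSORS.items():
        # Take the best of several repeats to reduce noise.
        seconds = min(timeit.repeat(lambda: compress(data), number=20, repeat=5))
        ratio = len(compress(data)) / len(data)
        print(f"{lib_name:>14} {data_name:>10}: {seconds * 1000:7.2f} ms / 20 runs, ratio={ratio:.3f}")
```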
Thank you for such a full reply! I hope I did not imply that this disparity was in any way the fault of this crate. Indeed, I failed in the construction of the benchmark data; it's great that you pointed that out and how best to take action on it. I'll be sure to reference this crate's benchmarks and make mine less biased. I knew something must have been off on my end, as demonstrated by the thorough benchmarks in this crate, so I'm happy to have your opinion on it now. Can't thank you enough for your thoughtful response, and again for all the work you do in Rust. 👏
Hey @BurntSushi, I added a PR; it seems it was indeed my failure in generating better data. Would you mind looking it over, and specifically checking whether it was okay to copy the data used in this crate? I did add the license. Thanks again!
@milesgranger I took the data from the reference implementation. I've also taken ideas from the reference implementation, which is why this crate uses the same license as the reference implementation. Other snappy implementations, like the one in Go, also use the benchmark data. |
Hi there,
First, thanks for a lovely lib, both here and all your other contributions in Rust! 💯
As a learning exercise I've made a Python lib wrapping various de/compression libs in Rust: cramjam. In running the benchmarks for the Snappy comparison, it appears to be slower than python-snappy by a fair margin (granted, both are quite quick); however, given your benchmarks, I was expecting they'd be a bit closer.
Here is where I expose the interface; I was curious whether you would be able to give me any pointers on things I may have missed.
Appreciate the help. 😃
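To make the comparison in question concrete, here is a minimal sketch (not taken from cramjam itself) of the kind of head-to-head timing described above. Only python-snappy's documented `snappy.compress` is called directly; the wrapped Rust function is left as a placeholder because its exact cramjam name isn't shown in this thread. As the maintainer's reply points out, a single payload like this, especially a repeated-byte one, is too narrow a basis for conclusions.

```python
import timeit

import snappy  # python-snappy

# A single payload of one repeated byte: the pathological case called out
# in the reply above. A fair benchmark would use many kinds of data instead.
payload = b"a" * 1_000_000

candidates = {
    "python-snappy": snappy.compress,
    # "cramjam": <wrapped Rust compress function>,  # placeholder; exact name not confirmed here
}

for name, compress in candidates.items():
    best = min(timeit.repeat(lambda: compress(payload), number=50, repeat=5))
    print(f"{name:>14}: {best:.4f} s for 50 calls")
```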