Improve sift_down performance in BinaryHeap #81127

hanmertens · 2021-01-17T17:55:03Z

Replacing child < end - 1 with child <= end.saturating_sub(2) in BinaryHeap::sift_down_range (surprisingly) results in a significant speedup of BinaryHeap::into_sorted_vec. The same substitution can be done for BinaryHeap::sift_down_to_bottom, which causes a slight but probably statistically insignificant speedup for BinaryHeap::pop. It's interesting that benchmarks aside from bench_into_sorted_vec are barely affected, even those that do use sift_down_* methods internally.

Benchmark	Before (ns/iter)	After (ns/iter)	Speedup
bench_find_smallest_1000¹	392,617	385,200	1.02
bench_from_vec¹	506,016	504,444	1.00
bench_into_sorted_vec¹	476,869	384,458	1.24
bench_peek_mut_deref_mut³	518,753	519,792	1.00
bench_pop²	446,718	444,409	1.01
bench_push³	772,481	770,208	1.00

¹: internally calls sift_down_range
²: internally calls sift_down_to_bottom
³: should not be affected
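To make the substitution concrete, here is a minimal sketch of a sift-down loop using the new condition, driving a heap sort in the style of `into_sorted_vec`. It is an illustration only: it works on a plain slice with swaps, whereas the real `BinaryHeap` methods are private and differ in detail, so the function bodies below are assumptions rather than the standard library's code.

```rust
/// Hypothetical stand-in for `BinaryHeap::sift_down_range` (max-heap),
/// written on a plain slice with swaps. Not the std implementation.
/// Precondition, as stated in the review: pos < end <= data.len().
fn sift_down_range<T: Ord>(data: &mut [T], mut pos: usize, end: usize) {
    let mut child = 2 * pos + 1;
    // New loop condition from this PR; the old one was `child < end - 1`.
    while child <= end.saturating_sub(2) {
        // Pick the greater of the two children.
        if data[child] <= data[child + 1] {
            child += 1;
        }
        // Stop once the parent is not smaller than its greater child.
        if data[pos] >= data[child] {
            return;
        }
        data.swap(pos, child);
        pos = child;
        child = 2 * pos + 1;
    }
    // A single left child may remain at index end - 1.
    if child == end - 1 && data[pos] < data[child] {
        data.swap(pos, child);
    }
}

/// Heap sort: heapify, then repeatedly move the maximum to the end of the
/// unsorted prefix and sift the new root down. The sifting runs in a loop,
/// which is why this benchmark is sensitive to the loop condition.
fn into_sorted_vec<T: Ord>(mut data: Vec<T>) -> Vec<T> {
    let len = data.len();
    let mut n = len / 2;
    while n > 0 {
        n -= 1;
        sift_down_range(&mut data, n, len);
    }
    let mut end = len;
    while end > 1 {
        end -= 1;
        data.swap(0, end);
        sift_down_range(&mut data, 0, end);
    }
    data
}

fn main() {
    let sorted = into_sorted_vec(vec![5, 1, 8, 3, 9, 2]);
    println!("{:?}", sorted);
}
```

Swapping the loop condition back to `child < end - 1` in this sketch should not change the output; the benchmark numbers above concern only the generated code.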

rust-highfive · 2021-01-17T17:55:06Z

r? @shepmaster

(rust-highfive has picked a reviewer for you, use r? to override)

…e, r=Mark-Simulacrum

Document BinaryHeap unsafe functions

`BinaryHeap` contains some private functions that are marked safe but are actually unsafe to call. This PR marks them `unsafe` and documents all the `unsafe` function calls inside them.

While doing this I might also have found a bug: some "SAFETY" comments in `sift_down_range` and `sift_down_to_bottom` are valid only if you assume that `child` doesn't overflow. However, it may overflow if `end > isize::MAX`, which can be true for ZSTs (but I think only for them). I guess the easiest fix would be to skip any sifting if `mem::size_of::<T>() == 0`.

Probably conflicts with rust-lang#81127, but solving the eventual merge conflict should be pretty easy.
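The overflow concern can be illustrated with plain index arithmetic. The sketch below assumes only that the child index is computed as `2 * pos + 1` (as stated later in the review) and uses a hypothetical helper, `left_child`, that is not part of `BinaryHeap`.

```rust
/// Hypothetical helper: computes the left-child index `2 * pos + 1`,
/// returning `None` if the computation would overflow `usize`.
fn left_child(pos: usize) -> Option<usize> {
    pos.checked_mul(2)?.checked_add(1)
}

fn main() {
    // Positions below isize::MAX, the bound assumed above for non-ZST lengths.
    let bound = isize::MAX as usize;

    // Below the bound, the child index still fits in a usize.
    assert_eq!(left_child(bound - 1), Some(usize::MAX - 2));

    // Above the bound (reachable only if `end > isize::MAX`, e.g. for ZSTs),
    // the multiplication overflows.
    assert_eq!(left_child(bound + 1), None);
}
```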

bors · 2021-02-21T15:08:34Z

☔ The latest upstream changes (presumably #82359) made this pull request unmergeable. Please resolve the merge conflicts.

Because child > 0, the two conditions are equivalent, but using saturating_sub and <= yields faster code. This is most notable in the binary_heap::bench_into_sorted_vec benchmark, which uses sift_down_range internally and shows a speedup of 1.26x. The speedup of pop (which uses sift_down_to_bottom internally) is much smaller, as the sifting method is not called in a loop.

shepmaster · 2021-03-08T14:44:47Z

r? @dtolnay

dtolnay

Thanks @hanmertens. I am not familiar with the BinaryHeap implementation but I am prepared to accept this on the basis of the into_sorted_vec benchmark. I confirmed that the behavior is logically identical to before in both places.

If end >= 2 then child < end - 1 is equivalent to child <= end - 2 is equivalent to child <= end.saturating_sub(2).
If end == 1 then child < end - 1 is false while child <= end.saturating_sub(2) is equivalent to child == 0. However it's a loop invariant that child == 2 * hole.pos() + 1 > 0 so child != 0 and child <= end.saturating_sub(2) is also false.
If end == 0 then contradiction because it's a precondition of the function that 0 <= pos < end, and end is not mutated.
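The case analysis above can also be checked by brute force over small values. The following sketch uses hypothetical helper names (`old_cond`, `new_cond`, `conditions_agree`) and assumes the precondition `pos < end` and the invariant `child == 2 * pos + 1` quoted in the review.

```rust
/// Old loop condition. Only meaningful for end >= 1, since `end - 1`
/// would underflow for end == 0.
fn old_cond(child: usize, end: usize) -> bool {
    child < end - 1
}

/// New loop condition introduced by the PR.
fn new_cond(child: usize, end: usize) -> bool {
    child <= end.saturating_sub(2)
}

/// Returns true if both conditions agree for every `pos < end <= max_end`
/// with `child = 2 * pos + 1`.
fn conditions_agree(max_end: usize) -> bool {
    (1..=max_end).all(|end| {
        (0..end).all(|pos| {
            let child = 2 * pos + 1;
            old_cond(child, end) == new_cond(child, end)
        })
    })
}

fn main() {
    assert!(conditions_agree(1_000));

    // The conditions differ only for child == 0 with end == 1,
    // which the invariant child > 0 rules out.
    assert!(!old_cond(0, 1));
    assert!(new_cond(0, 1));
}
```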

dtolnay · 2021-03-09T08:04:10Z

@bors r+

bors · 2021-03-09T08:04:12Z

📌 Commit 095bf01 has been approved by dtolnay

…rf, r=dtolnay

Improve sift_down performance in BinaryHeap

Replacing `child < end - 1` with `child <= end.saturating_sub(2)` in `BinaryHeap::sift_down_range` (surprisingly) results in a significant speedup of `BinaryHeap::into_sorted_vec`. The same substitution can be done for `BinaryHeap::sift_down_to_bottom`, which causes a slight but probably statistically insignificant speedup for `BinaryHeap::pop`. It's interesting that benchmarks aside from `bench_into_sorted_vec` are barely affected, even those that do use `sift_down_*` methods internally.

| Benchmark | Before (ns/iter) | After (ns/iter) | Speedup |
|---------------------------|------------------|-----------------|---------|
| bench_find_smallest_1000¹ | 392,617 | 385,200 | 1.02 |
| bench_from_vec¹ | 506,016 | 504,444 | 1.00 |
| bench_into_sorted_vec¹ | 476,869 | 384,458 | 1.24 |
| bench_peek_mut_deref_mut³ | 518,753 | 519,792 | 1.00 |
| bench_pop² | 446,718 | 444,409 | 1.01 |
| bench_push³ | 772,481 | 770,208 | 1.00 |

¹: internally calls `sift_down_range`
²: internally calls `sift_down_to_bottom`
³: should not be affected

Rollup of 8 pull requests

Successful merges:

- rust-lang#81127 (Improve sift_down performance in BinaryHeap)
- rust-lang#81879 (Added #[repr(transparent)] to core::cmp::Reverse)
- rust-lang#82048 (or-patterns: disallow in `let` bindings)
- rust-lang#82731 (Bump libc dependency of std to 0.2.88.)
- rust-lang#82799 (Add regression test for rust-lang#75525)
- rust-lang#82841 (Change x64 size checks to not apply to x32.)
- rust-lang#82883 (Update Cargo)
- rust-lang#82887 (Update CONTRIBUTING.md)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup

rust-highfive assigned shepmaster Jan 17, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 17, 2021

SkiFire13 mentioned this pull request Feb 3, 2021

Document BinaryHeap unsafe functions #81706

Merged

JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 7, 2021

hanmertens force-pushed the binary_heap_sift_down_perf branch from bdf2962 to 095bf01 Compare February 21, 2021 15:43

rust-highfive assigned dtolnay and unassigned shepmaster Mar 8, 2021

dtolnay approved these changes Mar 9, 2021

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 9, 2021

This was referenced Mar 9, 2021

Rollup of 8 pull requests #82928

Closed

Rollup of 8 pull requests #82929

Merged

bors merged commit c013dc0 into rust-lang:master Mar 9, 2021

rustbot added this to the 1.52.0 milestone Mar 9, 2021

hanmertens deleted the binary_heap_sift_down_perf branch March 9, 2021 12:52

clint-white mentioned this pull request Jul 2, 2022

The scope of the unsafe block can be appropriately reduced sekineh/binary-heap-plus-rs#32

Closed
