-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Improve sift_down performance in BinaryHeap #81127
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Conversation
r? @shepmaster (rust-highfive has picked a reviewer for you, use r? to override) |
…e, r=Mark-Simulacrum Document BinaryHeap unsafe functions `BinaryHeap` contains some private safe functions but that are actually unsafe to call. This PR marks them `unsafe` and documents all the `unsafe` function calls inside them. While doing this I might also have found a bug: some "SAFETY" comments in `sift_down_range` and `sift_down_to_bottom` are valid only if you assume that `child` doesn't overflow. However it may overflow if `end > isize::MAX` which can be true for ZSTs (but I think only for them). I guess the easiest fix would be to skip any sifting if `mem::size_of::<T> == 0`. Probably conflicts with rust-lang#81127 but solving the eventual merge conflict should be pretty easy.
…e, r=Mark-Simulacrum Document BinaryHeap unsafe functions `BinaryHeap` contains some private safe functions but that are actually unsafe to call. This PR marks them `unsafe` and documents all the `unsafe` function calls inside them. While doing this I might also have found a bug: some "SAFETY" comments in `sift_down_range` and `sift_down_to_bottom` are valid only if you assume that `child` doesn't overflow. However it may overflow if `end > isize::MAX` which can be true for ZSTs (but I think only for them). I guess the easiest fix would be to skip any sifting if `mem::size_of::<T> == 0`. Probably conflicts with rust-lang#81127 but solving the eventual merge conflict should be pretty easy.
…e, r=Mark-Simulacrum Document BinaryHeap unsafe functions `BinaryHeap` contains some private safe functions but that are actually unsafe to call. This PR marks them `unsafe` and documents all the `unsafe` function calls inside them. While doing this I might also have found a bug: some "SAFETY" comments in `sift_down_range` and `sift_down_to_bottom` are valid only if you assume that `child` doesn't overflow. However it may overflow if `end > isize::MAX` which can be true for ZSTs (but I think only for them). I guess the easiest fix would be to skip any sifting if `mem::size_of::<T> == 0`. Probably conflicts with rust-lang#81127 but solving the eventual merge conflict should be pretty easy.
☔ The latest upstream changes (presumably #82359) made this pull request unmergeable. Please resolve the merge conflicts. |
Because child > 0, the two statements are equivalent, but using saturating_sub and <= yields in faster code. This is most notable in the binary_heap::bench_into_sorted_vec benchmark, which shows a speedup of 1.26x, which uses sift_down_range internally. The speedup of pop (that uses sift_down_to_bottom internally) is much less significant as the sifting method is not called in a loop.
bdf2962
to
095bf01
Compare
r? @dtolnay |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hanmertens. I am not familiar with the BinaryHeap implementation but I am prepared to accept this on the basis of the into_sorted_vec benchmark. I confirmed that the behavior is logically identical to before in both places.
-
If
end >= 2
thenchild < end - 1
is equivalent tochild <= end - 2
is equivalent tochild <= end.saturating_sub(2)
. -
If
end == 1
thenchild < end - 1
is false whilechild <= end.saturating_sub(2)
is equivalent tochild == 0
. However it's a loop invariant thatchild == 2 * hole.pos() + 1 > 0
sochild != 0
andchild <= end.saturating_sub(2)
is also false. -
If
end == 0
then contradiction because it's a precondition of the function that0 <= pos < end
, andend
is not mutated.
@bors r+ |
📌 Commit 095bf01 has been approved by |
…rf, r=dtolnay Improve sift_down performance in BinaryHeap Replacing `child < end - 1` with `child <= end.saturating_sub(2)` in `BinaryHeap::sift_down_range` (surprisingly) results in a significant speedup of `BinaryHeap::into_sorted_vec`. The same substitution can be done for `BinaryHeap::sift_down_to_bottom`, which causes a slight but probably statistically insignificant speedup for `BinaryHeap::pop`. It's interesting that benchmarks aside from `bench_into_sorted_vec` are barely affected, even those that do use `sift_down_*` methods internally. | Benchmark | Before (ns/iter) | After (ns/iter) | Speedup | |--------------------------|------------------|-----------------|---------| | bench_find_smallest_1000<sup>1</sup> | 392,617 | 385,200 | 1.02 | | bench_from_vec<sup>1</sup> | 506,016 | 504,444 | 1.00 | | bench_into_sorted_vec<sup>1</sup> | 476,869 | 384,458 | 1.24 | | bench_peek_mut_deref_mut<sup>3</sup> | 518,753 | 519,792 | 1.00 | | bench_pop<sup>2</sup> | 446,718 | 444,409 | 1.01 | | bench_push<sup>3</sup> | 772,481 | 770,208 | 1.00 | <sup>1</sup>: internally calls `sift_down_range` <sup>2</sup>: internally calls `sift_down_to_bottom` <sup>3</sup>: should not be affected
Rollup of 8 pull requests Successful merges: - rust-lang#81127 (Improve sift_down performance in BinaryHeap) - rust-lang#81879 (Added #[repr(transparent)] to core::cmp::Reverse) - rust-lang#82048 (or-patterns: disallow in `let` bindings) - rust-lang#82731 (Bump libc dependency of std to 0.2.88.) - rust-lang#82799 (Add regression test for rust-lang#75525) - rust-lang#82841 (Change x64 size checks to not apply to x32.) - rust-lang#82883 (Update Cargo) - rust-lang#82887 (Update CONTRIBUTING.md) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
Replacing
child < end - 1
withchild <= end.saturating_sub(2)
inBinaryHeap::sift_down_range
(surprisingly) results in a significant speedup ofBinaryHeap::into_sorted_vec
. The same substitution can be done forBinaryHeap::sift_down_to_bottom
, which causes a slight but probably statistically insignificant speedup forBinaryHeap::pop
. It's interesting that benchmarks aside frombench_into_sorted_vec
are barely affected, even those that do usesift_down_*
methods internally.1: internally calls
sift_down_range
2: internally calls
sift_down_to_bottom
3: should not be affected