
Commit 788671c

Rollup merge of #106997 - Sp00ph:introselect, r=scottmcm
Add heapsort fallback in `select_nth_unstable`

Addresses #102451 and #106933.

`slice::select_nth_unstable` uses a quickselect implementation based on the same pattern-defeating quicksort algorithm that `slice::sort_unstable` uses. `slice::sort_unstable` uses a recursion limit and falls back to heapsort if there were too many bad pivot choices, to ensure O(n log n) worst case running time (known as introsort). However, `slice::select_nth_unstable` does not have such a fallback strategy, which leads to it having a worst case running time of O(n²) instead.

#102451 links to a playground which generates pathological inputs that show this quadratic behavior. On my machine, a randomly generated slice of length `1 << 19` takes ~200µs to calculate its median, whereas a pathological input of the same length takes over 2.5s.

This PR adds an iteration limit to `select_nth_unstable`, falling back to heapsort, which ensures an O(n log n) worst case running time (introselect). With this change, there was no noticeable slowdown for the random input, but the same pathological input now takes only ~1.2ms.

In the future it might be worth implementing something like Median of Medians or Fast Deterministic Selection instead, which guarantee O(n) running time for all possible inputs. I've left this as a `FIXME` for now and only implemented the heapsort fallback to minimize the needed code changes.

I still think we should clarify in the `select_nth_unstable` docs that the worst case running time isn't currently O(n) (the original reason that #102451 was opened), but I think it's a lot better to be able to guarantee O(n log n) instead of O(n²) for the worst case.
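For readers unfamiliar with the API under discussion, here is a minimal sketch of the median computation mentioned above. The helper name `median` and the sample data are assumptions made for illustration; only `slice::select_nth_unstable` comes from the standard library.

```rust
/// Hypothetical helper: returns the median of a non-empty slice
/// (the upper median when the length is even).
fn median(values: &mut [u32]) -> u32 {
    let mid = values.len() / 2;
    // `select_nth_unstable` reorders the slice so that `values[mid]` holds the
    // element that would be there if the slice were sorted. Everything before
    // it is <= that element and everything after it is >= that element.
    let (_lower, median, _upper) = values.select_nth_unstable(mid);
    *median
}

fn main() {
    let mut data = vec![9, 1, 8, 2, 7, 3, 6, 4, 5];
    let m = median(&mut data);
    // The sorted order is 1..=9, so the element at index 4 is 5.
    assert_eq!(m, 5);
    println!("median = {m}");
}
```

Note that the call only guarantees the position of the selected element and the partitioning around it; the two halves are not sorted.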
2 parents f547bb5 + 273c6c3 commit 788671c


1 file changed: +22 -0 lines changed

library/core/src/slice/sort.rs

@@ -831,6 +831,15 @@ fn partition_at_index_loop<'a, T, F>(
 ) where
     F: FnMut(&T, &T) -> bool,
 {
+    // Limit the amount of iterations and fall back to heapsort, similarly to `slice::sort_unstable`.
+    // This lowers the worst case running time from O(n^2) to O(n log n).
+    // FIXME: Investigate whether it would be better to use something like Median of Medians
+    // or Fast Deterministic Selection to guarantee O(n) worst case.
+    let mut limit = usize::BITS - v.len().leading_zeros();
+
+    // True if the last partitioning was reasonably balanced.
+    let mut was_balanced = true;
+
     loop {
         // For slices of up to this length it's probably faster to simply sort them.
         const MAX_INSERTION: usize = 10;
@@ -839,6 +848,18 @@ fn partition_at_index_loop<'a, T, F>(
             return;
         }

+        if limit == 0 {
+            heapsort(v, is_less);
+            return;
+        }
+
+        // If the last partitioning was imbalanced, try breaking patterns in the slice by shuffling
+        // some elements around. Hopefully we'll choose a better pivot this time.
+        if !was_balanced {
+            break_patterns(v);
+            limit -= 1;
+        }
+
         // Choose a pivot
         let (pivot, _) = choose_pivot(v, is_less);

@@ -863,6 +884,7 @@ fn partition_at_index_loop<'a, T, F>(
         }

         let (mid, _) = partition(v, pivot, is_less);
+        was_balanced = cmp::min(mid, v.len() - mid) >= v.len() / 8;

         // Split the slice into `left`, `pivot`, and `right`.
         let (left, right) = v.split_at_mut(mid);
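The control flow this diff adds can be sketched in isolation as follows. This is a simplified illustration, not the standard library's code: `iteration_limit`, `partition_last`, and `select_sketch` are hypothetical names, a last-element Lomuto partition stands in for `choose_pivot` and `partition`, a single swap stands in for `break_patterns`, and `sort_unstable` stands in for both the insertion sort and the `heapsort` fallback. Only the limit formula and the balance check are taken from the diff.

```rust
use std::cmp;

/// Same formula as in the diff: floor(log2(len)) + 1, or 0 for an empty slice.
fn iteration_limit(len: usize) -> u32 {
    usize::BITS - len.leading_zeros()
}

/// Lomuto partition around the last element; returns the pivot's final index.
/// (Assumed stand-in for the library's `choose_pivot` + `partition`.)
fn partition_last<T: Ord>(v: &mut [T]) -> usize {
    let last = v.len() - 1;
    let mut store = 0;
    for i in 0..last {
        if v[i] < v[last] {
            v.swap(i, store);
            store += 1;
        }
    }
    v.swap(store, last);
    store
}

/// Reorders `v` so that `v[index]` holds the element of rank `index`.
fn select_sketch<T: Ord>(v: &mut [T], index: usize) {
    assert!(index < v.len());
    // Invariant: `lo <= index < hi`, and everything outside `lo..hi` is
    // already on the correct side of the active range.
    let (mut lo, mut hi) = (0, v.len());
    let mut limit = iteration_limit(v.len());
    let mut was_balanced = true;
    loop {
        let sub = &mut v[lo..hi];
        let len = sub.len();
        // Small range, or too many bad pivots: sort the remaining range.
        // (The real code uses insertion sort and heapsort respectively.)
        if len <= 10 || limit == 0 {
            sub.sort_unstable();
            return;
        }
        if !was_balanced {
            // Crude stand-in for `break_patterns`: move the middle element
            // into the pivot position so the next pivot differs.
            sub.swap(len / 2, len - 1);
            limit -= 1;
        }
        let mid = partition_last(sub);
        // Balance check from the diff: the smaller side must hold at least
        // an eighth of the elements.
        was_balanced = cmp::min(mid, len - mid) >= len / 8;
        let pivot_pos = lo + mid;
        if pivot_pos < index {
            lo = pivot_pos + 1;
        } else if pivot_pos > index {
            hi = pivot_pos;
        } else {
            return;
        }
    }
}

fn main() {
    // For the slice length quoted in the commit message the limit is 20.
    assert_eq!(iteration_limit(1 << 19), 20);
    // Sorted input is a bad case for a last-element pivot.
    let mut v: Vec<u32> = (0..1000).collect();
    select_sketch(&mut v, 500);
    assert_eq!(v[500], 500);
}
```

Because the limit is only decremented after an imbalanced partition, well-behaved inputs never reach the fallback, which is consistent with the commit message's report of no noticeable slowdown on random input.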
