Skip to content

Add BinaryHeap::contains and BinaryHeap::remove #82002

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
wants to merge 5 commits into from
Closed

Add BinaryHeap::contains and BinaryHeap::remove #82002

wants to merge 5 commits into from

Conversation

billyrieger
Copy link
Contributor

Tracking issue: #82001.

@rust-highfive
Copy link
Contributor

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 11, 2021
I think incorrect (or malicious) `PartialEq` and `PartialOrd` implementations for `T` could define `x == y` and `x < y` to both be conditionally true with interior mutability in such a way that `pos == len - 1` (implying `data[pos] == data[len - 1]`) but  we end up in the `data[pos] < data[len - 1]` branch.  `sift_up()` and `sift_down()` assume that `pos` is in-bounds which would be violated in this case.
@m-ou-se
Copy link
Member

m-ou-se commented Mar 3, 2021

Thanks for your PR.

I'm afraid that adding these functions will be misleading, as they do nothing special with the structure of a binary heap, making them O(n) just like on a Vec.

For contains, using .iter() to look for an item seems fine to me, and makes the time complexity a lot clearer. Maybe #82331 will help here by allowing .as_slice().contains(x). (Feel free to add a comment there about your use case.)

For remove, there's already .retain(|e| e != x), although that one doesn't stop the search after finding the element. Maybe a function to remove an item at a specific index would make sense in addition to #82331.

What do you think?

@Thomasdezeeuw
Copy link
Contributor

@m-ou-se I'm mostly interesting interest in the remove function from this pr. If @billyrieger could change it to take advantage of the internal layout, i.e that fact it is a max-heap, would such a change be acceptable?

@scottmcm
Copy link
Member

scottmcm commented Mar 4, 2021

Maybe remove-by-index could make sense? The heap can implement that efficiently, and the index could be found by .iter().position(p) since that's going to be linear complexity anyway...

@billyrieger
Copy link
Contributor Author

@m-ou-se I agree that contains is redundant/unhelpful but like you said there's currently no good analogue for remove. Adding a remove_at method or something similar along with as_slice would be a solid alternative though.

Initially I had an implementation that used the max-heap property to search for elements as @Thomasdezeeuw suggested but I'm not sure that it's any better than just iterating over the underlying data. The average- and worst-case time complexity is still the same, at least.

impl<T: Ord> BinaryHeap<T> {
    fn find(&self, item: &T) -> Option<usize> {
        self.find_recursive(item, 0)
    }

    fn find_recursive(&self, item: &T, pos: usize) -> Option<usize> {
        if pos >= self.data.len() {
            None
        } else {
            match self.data[pos].cmp(item) {
                // If `self.data[pos] == item`, return the item's position.
                Ordering::Equal => Some(pos),
                // If `self.data[pos] < item`, the item cannot be in either of
                // the child branches (because this is a max heap).
                Ordering::Less => None,
                // If `self.data[pos] > item`, we need to search both child branches.
                Ordering::Greater => {
                    let left_child = 2 * pos + 1;
                    let right_child = 2 * pos + 2;
                    self.find_recursive(item, left_child)
                        .or_else(|| self.find_recursive(item, right_child))
                }
            }
        }
    }
}

@billyrieger
Copy link
Contributor Author

the index could be found by .iter().position(p)

This relies on the fact that heap.iter() has the same ordering as the underlying Vec<T> which isn't necessarily the case (I believe the docs say the iteration order is unspecified). Adding as_slice would make this work (#82331).

@bors
Copy link
Collaborator

bors commented Mar 4, 2021

☔ The latest upstream changes (presumably #82756) made this pull request unmergeable. Please resolve the merge conflicts.

@crlf0710 crlf0710 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 20, 2021
@crlf0710
Copy link
Member

crlf0710 commented Mar 20, 2021

Triage: there's merge conflicts now, and it would be best if you could start a discussion on Zulip to make progress on this.

@JohnCSimon
Copy link
Member

@billyrieger ping from triage: can you please fix the merge conflicts? Thank you.

@billyrieger
Copy link
Contributor Author

I'm going to close this - #82331 covers some of the same functionality and at this point it's unclear what the implementation details for BinaryHeap::remove should be.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants