Skip to content

Short-circuit Rc/Arc equality checking on equal pointers where T: Eq #42965

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
wants to merge 1 commit into from

Conversation

joliss
Copy link
Contributor

@joliss joliss commented Jun 29, 2017

Closes #42655


Some notes:

  • Testing this would perhaps be a bunch more involved than the production code itself, so I went without tests for now.
  • I'm not sure if comparison functions should short-circuit. For example, rc1 < rc2 could return false if they point to them same object where T: Eq. It's perhaps a less clear-cut case than equality checking, so I decided to leave it out. We can always add it later if somebody can make a good argument.
  • I found it rather difficult to work on this because touch src/liballoc/rc.rs && ./x.py test -i src/liballoc --test-args rc:: takes 15 minutes (it recompiles a whole bunch of things after liballoc.) Am I doing something wrong?

@rust-highfive
Copy link
Contributor

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @sfackler (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@sfackler
Copy link
Member

@rust-lang/libs Thoughts on this? It is an interesting semantic change. Conditioning the behavior on Eq seems correct to me but I'd be interested if there are concerns around semantic changes in side-effectful implementations.

@sfackler sfackler added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jun 30, 2017
@alexcrichton
Copy link
Member

I'd personally be wary of having this implementation, I feel like using Eq just barely gets around the nan/float problem, and in general this is such a core type it may run into problems in practice.

@joliss did you have some particular benchmarks and/or applications in mind that would benefit from this?

It's also worth pointing out that we have a similar optimization for slices, but it's only done on types which can be considered bytewise equal, not for all types.

@ranma42
Copy link
Contributor

ranma42 commented Jun 30, 2017

I assumed that the Eq requirement is to ensure that the relation is reflexive (as specified in the documentation for Eq).

@joliss
Copy link
Contributor Author

joliss commented Jun 30, 2017

I'd personally be wary of having this implementation, I feel like using Eq just barely gets around the nan/float problem, and in general this is such a core type it may run into problems in practice.

What problems are you envisioning?

@joliss did you have some particular benchmarks and/or applications in mind that would benefit from this?

Here's our use case: We have a complex struct with several heap-allocated fields (strings and vectors of other structs), so comparing it expensive. We're using persistent data structure libraries like hamt-rs or im-rs, which use Arc/Rc internally. Their eq implementations don't usually short-circuit on pointer equality. I've been tempted to send pull requests, but the short-circuiting logic arguably belongs in Arc/Rc itself, not in libraries building on top of it.

@shepmaster shepmaster added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 30, 2017
@sfackler
Copy link
Member

sfackler commented Jul 1, 2017

@alexcrichton I wouldn't say that the Eq bound barely gets around the issue - one of its basic requirements is a == a which is all this optimization assumes.

@alexcrichton
Copy link
Member

@joliss thanks for explaining the use case! I think that the question though sounds like it still should be punted to the standard library becuase hamt-rs and the like seem like they'd just face similar questions as to whether this is a valid optimization or not.

I'm not worried about anything concrete per-se but you could imagine implementations of Eq which have weird internal mutability or something like that. The point brought up by @sfackler though is a good one in that these concretely are the same instance in memory, and the reflexivity property of Eq requires it to be true. That being said then if you have a buggy Eq implementation that could lead to really weird bugs like Rc::new(x) being equal to itself but x isn't equal to itself.

That line of reasoning can get sort of interesting because it's memory safe to have buggy implementations so we'd want to make sure nothing bad can happen here. I don't think this is a problem though because you can already have a buggy Eq implementation so no one can rely on the properties of Eq for memory safety anyway.

In any case, I'd personally still find the location of the implementation here a little odd. For example if we feel that we should apply such an optimization to Rc then we should do so for &T as well, right? And similarly for &[T]? The implementations for Rc and Arc could delegate to these internal ones to avoid duplication of code.

As a final point, we currently avoid having default on public-facing impls, so for the specialization here could you make a private trait which has a default method to override instead?

@aturon
Copy link
Member

aturon commented Jul 3, 2017

Thanks for the PR, @joliss!

In any case, I'd personally still find the location of the implementation here a little odd. For example if we feel that we should apply such an optimization to Rc then we should do so for &T as well, right? And similarly for &[T]? The implementations for Rc and Arc could delegate to these internal ones to avoid duplication of code.

The difference was addressed in the original issue:

It is conceivable that in rare cases the equality-check on the inner value is so cheap that attempting to short-circuit adds more overhead than it saves. However, my sense is that Rc is usually used on types that are too complex to clone cheaply, and so it stands to reason that in most cases, the equality check on the inner value is expensive. (This is certainly true for the use cases I've encountered.) Checking equality on the pointers furthermore is very cheap, since we're dereferencing them anyway.

I agree with this line of reasoning; the tradeoffs around &T are quite different.

This strikes me as similar to a change we made elsewhere that essentially assumed the PartialOrd and Ord impls did the same thing. Since Eq is just a marker about total equality, I feel quite comfortable making this change, personally.

@alexcrichton
Copy link
Member

Hm I'm not sure that makes sense to me though? If &T == &T is too expensive to do pointer checks, isn't the same going to be true for Rc<T> == Rc<T>?

@aturon
Copy link
Member

aturon commented Jul 3, 2017

@alexcrichton

The key sentence is this one:

However, my sense is that Rc is usually used on types that are too complex to clone cheaply, and so it stands to reason that in most cases, the equality check on the inner value is expensive.

The point is that, given an arbitrary T, working with &T doesn't tell you much about whether a pointer check or full equality check is likely to pay off. For one thing, there's no reason to think that references have much of a chance of pointing to the exact same object. And for some T, you'd be better off just doing the comparison.

For your average Rc<T>, though, it's much likelier that:

  • The things your comparing actually point to the same object internally.
  • A pointer comparison is cheaper than a full comparison, due to objects behind an Rc usually being "large" in some sense.

The point is that, while there's not a huge semantic difference between Rc<T> and &T for comparisons (i.e., they're both just "pointers"), there's a substantial difference in common real-world usage that dictates different tradeoffs around optimizations like this one.

@joliss
Copy link
Contributor Author

joliss commented Jul 3, 2017

As a final point, we currently avoid having default on public-facing impls, so for the specialization here could you make a private trait which has a default method to override instead?

I'm not sure I understand the underlying problem, but I'm happy to try and change my PR.

However, I'm struggling with the development workflow on the Rust core repo, as I haven't been able to get incremental compilation to work. I tried ./x.py test -i src/liballoc --test-args rc:: but it takes 15 minutes for me. This makes it nearly impossible to work on the code. Can somebody point me in the right direction?

@cengiz-io
Copy link
Contributor

Hello @joliss

Are you using a config.toml for bootstrap?

You can find the example at https://github.com/rust-lang/rust/blob/master/src/bootstrap/config.toml.example

You should configure it (it has extensive comments inside) and place it near x.py

Let's see if that will improve it a bit

@nagisa
Copy link
Member

nagisa commented Jul 4, 2017

@joliss try --stage=0 or --stage=1 --keep-stage=0.

@alexcrichton
Copy link
Member

@aturon sure yeah I don't disagree with any of that, but I'm worried about the gotchas here outside of "standard usage of Rc". For example here's a scenario that I would find very surprising. Rc<T> == Rc<T> being cheap does not imply &T == &T will be cheap. If an application relies on this optimization then it would have to never use the &T version (which to me seems like it'd be much more common in generic code? We also give no recourse to actually verify this.

Similarly it seems claimed that the Eq for &T implementation to do pointer checks is too expensive. What if there's an application that uses Rc<T> for this type and finds the pointer checks too expensive as well?

What I think I may be getting at is that in many case the standard library tries to be simple to be easily understandable but more importantly predictable. We've certainly departed from this pretty heavily with the usage of specialization as small optimizations here and there (especially with iterators and collect), though.

I'm wary of opening the floodgates for any and all specialization-related optimizations being added to the standard library, so I feel like we need to draw the line somewhere. I'm not sure what side of the line this specific optimization is on and I may just be playing devil's advocate too much. You're not going to find a concrete reason to not land this from me, so that means we should just r+ this and land it when it's ready.

@carols10cents
Copy link
Member

carols10cents commented Jul 10, 2017

sooo do I hear an r+ from @aturon or @alexcrichton or @sfackler? :)

@aturon
Copy link
Member

aturon commented Jul 10, 2017

@carols10cents We're currently waiting on some fixes from the author. (Some realistic benchmark numbers would be welcome but not required). Otherwise we are good to go.

@aturon aturon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 10, 2017
@carols10cents
Copy link
Member

Hi @joliss! Just a friendly ping -- do you need any help on this?

@alexcrichton
Copy link
Member

Closing due to inactivity to help clear out the queue, but please feel free to resubmit or ping us to reopen if comments are addressed!

@alexcrichton alexcrichton added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 20, 2017
@joliss
Copy link
Contributor Author

joliss commented Oct 30, 2017

Quick update: I remember there were some things we might or might not want to change about this, but I gave up on updating this PR because recompilation is incredibly slow for me and the disk space requirements blow the size of my development VM.

If anybody else wants to take over this patch, please have at it! (It might in fact be mergeable as is, but I'm not 100% sure.)

bors added a commit that referenced this pull request Dec 19, 2018
Short-circuit Rc/Arc equality checking on equal pointers where T: Eq

based on #42965

Is the use of the private trait ok this way? Is there anything else needed for this to get pulled?
Centril added a commit to Centril/rust that referenced this pull request May 11, 2019
add comment to `Rc`/`Arc`'s `Eq` specialization

in addition to rust-lang#56550

rust-lang#42965 (comment)
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants