incr.comp.: Use a set implementation optimized for small item counts for deduplicating read-edges. #45577
Something like this? https://crates.io/crates/vec_map
@leonardo-m No, this structure is more like https://docs.rs/david-set/0.1.2/david_set/struct.Set.html.
// Many kinds of nodes often only have between 0 and 3 edges, so we provide a
// specialized set implementation that does not allocate for those small counts.
#[derive(Debug, PartialEq, Eq)]
enum DepNodeIndexSet {
Could you move this to a generic data structure in `rustc_data_structures`?
Could you try and get some numbers on the performance improvement or the hit rate? Even if you don't, r=me with the set moved to
@bors try
Preparing for perf.
incr.comp.: Use a set implementation optimized for small item counts for deduplicating read-edges. Many kinds of `DepNodes` will only ever have between zero and three edges originating from them (see e.g. #45063 (comment)) so let's try to avoid allocating a `HashSet` in those cases. r? @nikomatsakis
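For readers who want a picture of the idea described above, here is a minimal sketch of a set that stores up to three items inline and only falls back to a `HashSet` on the fourth distinct insert. It is not the code from this PR: the name `SmallSet`, its variants, and its methods are hypothetical, and the `Copy` bound is an assumption made to keep the example short (it fits index-like keys).

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical small-set sketch: zero to three items live inline in the
/// enum, so no heap allocation happens until a fourth distinct item arrives.
#[derive(Debug)]
enum SmallSet<T: Eq + Hash + Copy> {
    Zero,
    One(T),
    Two(T, T),
    Three(T, T, T),
    Many(HashSet<T>),
}

impl<T: Eq + Hash + Copy> SmallSet<T> {
    fn new() -> Self {
        SmallSet::Zero
    }

    /// Inserts `x`; returns `true` if it was not already present.
    fn insert(&mut self, x: T) -> bool {
        let next = match *self {
            SmallSet::Zero => SmallSet::One(x),
            SmallSet::One(a) => {
                if a == x {
                    return false;
                }
                SmallSet::Two(a, x)
            }
            SmallSet::Two(a, b) => {
                if a == x || b == x {
                    return false;
                }
                SmallSet::Three(a, b, x)
            }
            SmallSet::Three(a, b, c) => {
                if a == x || b == x || c == x {
                    return false;
                }
                // Fourth distinct item: this is the first heap allocation.
                let mut set = HashSet::with_capacity(4);
                set.insert(a);
                set.insert(b);
                set.insert(c);
                set.insert(x);
                SmallSet::Many(set)
            }
            SmallSet::Many(ref mut set) => return set.insert(x),
        };
        *self = next;
        true
    }

    /// Linear scan for the inline variants, hash lookup once spilled.
    fn contains(&self, x: &T) -> bool {
        match self {
            SmallSet::Zero => false,
            SmallSet::One(a) => a == x,
            SmallSet::Two(a, b) => a == x || b == x,
            SmallSet::Three(a, b, c) => a == x || b == x || c == x,
            SmallSet::Many(set) => set.contains(x),
        }
    }

    fn len(&self) -> usize {
        match self {
            SmallSet::Zero => 0,
            SmallSet::One(_) => 1,
            SmallSet::Two(..) => 2,
            SmallSet::Three(..) => 3,
            SmallSet::Many(set) => set.len(),
        }
    }
}
```

With a type like this, deduplicating reads for a node with at most three edges is a handful of comparisons and no allocation. Whether that shows up in end-to-end numbers depends on how cheap the allocator already is for small allocations.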
☀️ Test successful - status-travis
@rust-lang/infra perf check requested from #45577 (comment).
This is fine, but an Note: ArrayVec is already in the codebase.
The improvement/regression is negligible. There is even a +17.4% max-rss (memory use) in the
Thanks for kicking off the performance measurement, @kennytm!
@julian-seward1 and I ran into this function as one of the hotter ones while profiling an incremental build of the regex crate. The compiler spends roughly 0.9% and 1.6% of cycles in the
The main intended optimization here is to get rid of the heap allocation for small hash sets. It's a bit surprising that the effect on the overall instruction count is so small. I would have hoped for -0.5% instead of -0.1%.
This must be some other effect. I don't see how this could introduce any noticeable increase in memory consumption. Thanks for your comments, everyone! Maybe I'll tinker some more with this in my spare time. Closing for now.
@michaelwoerister Probably it's because jemalloc is very efficient? 😄