-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Avoid hashing more than is strictly necessary, in the compiler. #56308
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Comments
I have a branch which adds functions for interning based on this PR. I suggest that further work be based on that. cc @nnethercote |
@Zoxc That is, change the implementation of the |
Sure. I think we have to change things to use a |
Oh, another place where we could take advantage of computing the hash only once is the query engine, in case we had to compute a query, and we're inserting the result. |
Another note, not directly related to interners: We have some things in the compiler that already contain high-quality hash values of themselves. E.g. it's unnecessary work to hash a rust/src/librustc/ty/context.rs Line 919 in 691a7f8
There are probably a few additional, similar cases. |
@michaelwoerister I think we can maybe use a wrapper for the hash state that uses specialization to avoid double-hashing, in that case, instead of relying on |
I think the raw_entry API would be the nicer way to do it. It seems to be tailored to support a case just like this (among other use cases). |
@michaelwoerister Yes but it would require making an entire wrapper of the |
What would your solution look like? |
Use raw_entry for more efficient interning Fixes rust-lang#56308 (comment)
I'll reopen this because the discussion mentioned things other than interners. Also, I still have to write those code examples for @michaelwoerister. |
Avoid rehashing Fingerprint as a map key This introduces a no-op `Unhasher` for map keys that are already hash- like, for example `Fingerprint` and its wrapper `DefPathHash`. For these we can directly produce the `u64` hash for maps. The first use of this is `def_path_hash_to_def_id: Option<UnhashMap<DefPathHash, DefId>>`. cc rust-lang#56308 r? @eddyb
The interners currently do use the raw entry APIs:
|
We should still look for other uses of hash maps with keys that are already hashes |
Use UnhashMap for a few more maps This avoids a few cases of hashing data that's already hashed. cc rust-lang#56308
I ran some analysis to find the maps changed in #120076 by looking for things put into HashMaps that looked like they were already hashes, using a few heuristics (e.g., similar number of 1 and 0 bits). I think that found most or all of the problem places. I don't see a good way to assert we don't regress there -- with a TypeId for non 'static things, we could assert in Hash64/Hash128 etc impls that we're not hashing them with FxHash (or that we're only hashing them with Unhash). But that assert doesn't exist today, and I don't think adding e.g. specialization or a doc(hidden) method on Hasher in std makes sense for this at this time. I think the next big opportunity for reducing hashing would be maps with dense, relatively small integer valued keys, where we actually don't need to hash because they're sufficiently dense that a vec makes more sense. We saw big wins with #119977 from both (a) removing the hashing and (b) improved cache locality due to keeping things closer together. But those typically aren't as easy of a transformation to make, and to some extent are a time/space tradeoff. |
WIP: vendor rustc-hash and override str/length_prefix methods Locally avoiding length and str sentinel hashing adds 2,611,200 cases of hashes with just one add_to_hash call (out of roughly 43,063,296 total cases). Unclear what kind of effect this will have on runtime performance at this time, but seems like it's worth trying! cc rust-lang#56308
Use UnhashMap for a few more maps This avoids a few cases of hashing data that's already hashed. cc rust-lang#56308
WIP: vendor rustc-hash and override str/length_prefix methods Locally avoiding length and str sentinel hashing adds 2,611,200 cases of hashes with just one add_to_hash call (out of roughly 43,063,296 total cases). Unclear what kind of effect this will have on runtime performance at this time, but seems like it's worth trying! cc rust-lang#56308
Use UnhashMap for a few more maps This avoids a few cases of hashing data that's already hashed. cc rust-lang/rust#56308
Use UnhashMap for a few more maps This avoids a few cases of hashing data that's already hashed. cc rust-lang/rust#56308
Uh oh!
There was an error while loading. Please reload this page.
It should be possible to build cheaper interners than what we currently have, using
raw_entry
.cc @rust-lang/compiler @gankro @Amanieu
The text was updated successfully, but these errors were encountered: