chore: refactor IndexedCrate::new in preparation for parallelizing #405
Thanks for putting this together and splitting up the PR, I really appreciate it!
I have a few minor comments, otherwise this is good to merge!
src/indexed_crate.rs
let (min, max) = iter.size_hint();
// Reserving this space doesn't give a measurable gain, but neither does it hurt so we are
// keeping it around
let mut map = Self::with_capacity(max.unwrap_or(min));
If this doesn't produce measurable gain, I'd prefer to not include it for complexity and maintenance reasons.
Given that reserving space sometimes seems to make performance worse, I think it's a risk with larger downside than upside. We don't have enough test infrastructure set up to monitor for and prevent performance regressions, so we should minimize the risk they happen unnoticed.
I like the comment below that says "we tried reserving space and this was the result." Would you mind switching to something similar here?
No problem!
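To illustrate the concern above, here is a minimal sketch, assuming a filtered iterator over a slice. The helper names `collect_evens` and `filtered_size_hint` are hypothetical and do not come from this PR. For a filtered iterator, the upper bound of `size_hint()` is the length of the unfiltered input, so `max.unwrap_or(min)` can reserve more space than ends up being used.

```rust
use std::collections::HashMap;

/// Hypothetical helper: collects the even-keyed entries without reserving
/// capacity by hand, leaving the sizing to `collect()`.
pub fn collect_evens(input: &[(u32, &'static str)]) -> HashMap<u32, &'static str> {
    input
        .iter()
        .copied()
        .filter(|(key, _)| key % 2 == 0)
        .collect()
}

/// Hypothetical helper: returns the `(min, max)` size hint of the filtered iterator.
/// The lower bound is 0 because the filter may reject every element; the upper
/// bound is the length of the unfiltered input.
pub fn filtered_size_hint(input: &[(u32, &'static str)]) -> (usize, Option<usize>) {
    input.iter().filter(|(key, _)| key % 2 == 0).size_hint()
}
```

With three input entries of which two have even keys, the hint is `(0, Some(3))` while the resulting map holds two entries, so reserving the upper bound would overshoot.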
src/indexed_crate.rs
/// getting the [`IndexedCrate::publicly_importable_names`] over the items in the index.
fn build_imports_index<'a>(
    index: &'a HashMap<Id, Item>,
    value: &IndexedCrate<'a>,
This `&IndexedCrate<'a>` is actually not fully constructed at the time of this call. That is a potential footgun that could waste a few hours in a future iteration of tweaking this code, and I'd prefer to avoid introducing it in the first place.
Would you mind inlining this method back into `IndexedCrate::new()` where it's much more clear that the `value` is not fully constructed yet?
It might even be good to add an explicit comment about how it's not fully constructed at that point of use, to explain why extracting that complexity into its own method isn't desirable. In retrospect, that isn't obvious and I totally understand why you moved it here as a result 😅
Fixed!
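The footgun discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: `Indexed` and its fields are hypothetical stand-ins, not the real `IndexedCrate` or its layout.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a crate index; not the real `IndexedCrate`.
pub struct Indexed<'a> {
    pub items: &'a HashMap<u32, String>,
    /// Filled in during the second phase of `new()`; `None` until then.
    pub by_name: Option<HashMap<&'a str, u32>>,
}

impl<'a> Indexed<'a> {
    pub fn new(items: &'a HashMap<u32, String>) -> Self {
        // Phase 1: the value exists, but `by_name` is not populated yet.
        let mut value = Self { items, by_name: None };

        // Phase 2: kept inline on purpose. Here it is visible that `value` is only
        // partially constructed. A helper taking `&Indexed<'a>` would hide that,
        // and a later change inside the helper could read `by_name` too early.
        let by_name: HashMap<&'a str, u32> = value
            .items
            .iter()
            .map(|(id, name)| (name.as_str(), *id))
            .collect();
        value.by_name = Some(by_name);

        value
    }
}
```

Keeping the second phase inside `new()` means any reader of that code sees the `None` placeholder a few lines above, which is the visibility the review comment asks for.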
let iter = index.iter();
iter.filter_map(|(id, item)| {
I mentioned in a parallel PR that as a personal preference thing, I find code easier to read if it creates and immediately consumes iterators instead of going through an intermediate variable.
If you don't mind, I'd appreciate it if you could tweak the code here and elsewhere in this PR in that manner, to help me maintain this code with less work in the long run:
- let iter = index.iter();
- iter.filter_map(|(id, item)| {
+ index.iter().filter_map(|(id, item)| {
We introduce the variable back once we parallelize it with rayon:
#[cfg(feature = "rayon")]
let iter = iter.par_iter();
#[cfg(not(feature = "rayon"))]
let iter = iter.iter();
I also prefer it the other way around.
I can switch it around, since reverting the rayon feature would then leave the code in a better state.
Ah that explains it, thank you.
I agree, adding the variable in the `rayon`-specific PR would be ideal.
Fixed!
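The two styles discussed in this thread can be compared side by side. The sketch below assumes a Cargo feature named `rayon` that enables the `rayon` crate, as in the snippet quoted above. The functions `named_ids` and `named_ids_chained` and the `HashMap<u32, String>` index are hypothetical, chosen only to keep the example small.

```rust
use std::collections::HashMap;

// Only compiled when the (assumed) `rayon` feature is enabled.
#[cfg(feature = "rayon")]
use rayon::prelude::*;

/// Feature-gated style: the intermediate `iter` variable lets one iterator
/// chain serve both the parallel and the sequential build.
pub fn named_ids(index: &HashMap<u32, String>) -> Vec<u32> {
    #[cfg(feature = "rayon")]
    let iter = index.par_iter();
    #[cfg(not(feature = "rayon"))]
    let iter = index.iter();

    let mut ids: Vec<u32> = iter
        .filter_map(|(id, name)| (!name.is_empty()).then_some(*id))
        .collect();
    // `HashMap` iteration order is unspecified, so sort for a stable result.
    ids.sort_unstable();
    ids
}

/// Sequential-only style: create the iterator and consume it immediately,
/// with no intermediate variable.
pub fn named_ids_chained(index: &HashMap<u32, String>) -> Vec<u32> {
    let mut ids: Vec<u32> = index
        .iter()
        .filter_map(|(id, name)| (!name.is_empty()).then_some(*id))
        .collect();
    ids.sort_unstable();
    ids
}
```

Without the feature gate the variable adds nothing, which matches the conclusion of the thread: introduce it only in the `rayon`-specific PR.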
Btw I just made a Discord server for cargo-semver-checks and Trustfall, to make coordination a bit easier. You're welcome to join if you'd like: https://discord.gg/s7Cfmp3P
Modify imperative code into iterator code to make it easier to parallelize with rayon.
No perf increase:
```
IndexedCrate/new(aws-sdk-ec2)
    time:   [1.3491 s 1.3505 s 1.3520 s]
    change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05)
    Change within noise threshold.
```
0339033 to 560c66e
) * chore: refactor IndexedCrate::new in preparation for parallelizing Modify imperative code into iterator code to make it easier to parallelize with rayon. No perf increase: ``` IndexedCrate/new(aws-sdk-ec2) time: [1.3491 s 1.3505 s 1.3520 s] change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05) Change within noise threshold. ``` * Update src/indexed_crate.rs --------- Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
) (#410) * chore: refactor IndexedCrate::new in preparation for parallelizing Modify imperative code into iterator code to make it easier to parallelize with rayon. No perf increase: ``` IndexedCrate/new(aws-sdk-ec2) time: [1.3491 s 1.3505 s 1.3520 s] change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05) Change within noise threshold. ``` * Update src/indexed_crate.rs --------- Co-authored-by: Jalil David Salamé Messina <60845989+jalil-salame@users.noreply.github.com>
) (#412) * chore: refactor IndexedCrate::new in preparation for parallelizing Modify imperative code into iterator code to make it easier to parallelize with rayon. No perf increase: ``` IndexedCrate/new(aws-sdk-ec2) time: [1.3491 s 1.3505 s 1.3520 s] change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05) Change within noise threshold. ``` * Update src/indexed_crate.rs --------- Co-authored-by: Jalil David Salamé Messina <60845989+jalil-salame@users.noreply.github.com>
) (#413) * chore: refactor IndexedCrate::new in preparation for parallelizing Modify imperative code into iterator code to make it easier to parallelize with rayon. No perf increase: ``` IndexedCrate/new(aws-sdk-ec2) time: [1.3491 s 1.3505 s 1.3520 s] change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05) Change within noise threshold. ``` * Update src/indexed_crate.rs --------- Co-authored-by: Jalil David Salamé Messina <60845989+jalil-salame@users.noreply.github.com>
) (#411) * chore: refactor IndexedCrate::new in preparation for parallelizing Modify imperative code into iterator code to make it easier to parallelize with rayon. No perf increase: ``` IndexedCrate/new(aws-sdk-ec2) time: [1.3491 s 1.3505 s 1.3520 s] change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05) Change within noise threshold. ``` * Update src/indexed_crate.rs --------- Co-authored-by: Jalil David Salamé Messina <60845989+jalil-salame@users.noreply.github.com>
Modify imperative code into iterator code to make it easier to parallelize with rayon.
No perf increase:
Split off from #402 (perf numbers come from the commits there)
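As a rough illustration of the kind of change described here, the sketch below shows an imperative loop next to an equivalent iterator chain. It is not the code from this PR: `index_lengths_loop` and `index_lengths_iter` are hypothetical functions that build a small map from names to their lengths.

```rust
use std::collections::HashMap;

/// Imperative version: mutate a map inside a `for` loop.
pub fn index_lengths_loop(names: &[&str]) -> HashMap<String, usize> {
    let mut map = HashMap::new();
    for name in names {
        if !name.is_empty() {
            map.insert(name.to_string(), name.len());
        }
    }
    map
}

/// Iterator version: the same filter and mapping expressed as a chain that
/// ends in `collect()`. Each element is handled independently of the others,
/// which is the shape a parallel iterator needs.
pub fn index_lengths_iter(names: &[&str]) -> HashMap<String, usize> {
    names
        .iter()
        .filter(|name| !name.is_empty())
        .map(|name| (name.to_string(), name.len()))
        .collect()
}
```

Both functions return the same map for the same input, so a refactor of this kind should not change behavior, only the form of the code.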