chore: refactor IndexedCrate::new in preparation for parallelizing #405

Merged

Conversation

jalil-salame
Contributor

Modify imperative code into iterator code to make it easier to parallelize with rayon.

No perf increase:

IndexedCrate/new(aws-sdk-ec2)
                        time:   [1.3491 s 1.3505 s 1.3520 s]
                        change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05)
                        Change within noise threshold.

Split off from #402 (perf numbers come from the commits there)
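To illustrate the kind of change described above, here is a minimal sketch of turning a `for` loop that mutates a map into an iterator chain. The `Item` struct and both function names are hypothetical stand-ins for illustration, not the crate's real types or the code in this PR.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an indexed item; not the real rustdoc `Item`.
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub name: Option<String>,
}

/// Imperative style: mutate a map from inside a `for` loop.
pub fn names_imperative(index: &HashMap<u32, Item>) -> HashMap<&str, Vec<u32>> {
    let mut names: HashMap<&str, Vec<u32>> = HashMap::new();
    for (id, item) in index {
        if let Some(name) = item.name.as_deref() {
            names.entry(name).or_default().push(*id);
        }
    }
    names
}

/// Iterator style: the per-item work lives in `filter_map`, and the
/// accumulation is a separate `fold` step at the end of the chain.
pub fn names_iterator(index: &HashMap<u32, Item>) -> HashMap<&str, Vec<u32>> {
    index
        .iter()
        .filter_map(|(id, item)| item.name.as_deref().map(|name| (name, *id)))
        .fold(
            HashMap::new(),
            |mut names: HashMap<&str, Vec<u32>>, (name, id)| {
                names.entry(name).or_default().push(id);
                names
            },
        )
}
```

Separating the per-item step from the accumulation is what tends to make a later switch to a parallel iterator a smaller change, since only the head of the chain and the final reduction need to differ.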

Owner

@obi1kenobi obi1kenobi left a comment

Thanks for putting this together and splitting up the PR, I really appreciate it!

I have a few minor comments, otherwise this is good to merge!

Comment on lines 54 to 57
let (min, max) = iter.size_hint();
// Reserving this space doesn't give a measurable gain, but neither does it hurt so we are
// keeping it around
let mut map = Self::with_capacity(max.unwrap_or(min));
Owner

If this doesn't produce measurable gain, I'd prefer to not include it for complexity and maintenance reasons.

Given that reserving space sometimes seems to make performance worse, I think it's a risk with larger downside than upside. We don't have enough test infrastructure set up to monitor for and prevent performance regressions, so we should minimize the risk they happen unnoticed.

I like the comment below that says "we tried reserving space and this was the result." Would you mind switching to something similar here?

Contributor Author

No problem!
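As a rough sketch of the trade-off discussed in this thread, the following shows a `FromIterator` impl that skips pre-reserving and records why in a comment. `MultiMap` and `hint_after_filter` are hypothetical names made up for this example, not types from the PR. The helper shows that after a `filter`, the upper bound from `size_hint()` is still the unfiltered length, so `with_capacity(max.unwrap_or(min))` can reserve more than is ever used.

```rust
use std::collections::HashMap;
use std::iter::FromIterator;

/// Hypothetical multimap wrapper, used only to illustrate the pattern.
#[derive(Debug, Default, PartialEq)]
pub struct MultiMap(HashMap<String, Vec<u32>>);

impl MultiMap {
    pub fn get(&self, key: &str) -> Option<&[u32]> {
        self.0.get(key).map(Vec::as_slice)
    }
}

impl FromIterator<(String, u32)> for MultiMap {
    fn from_iter<I: IntoIterator<Item = (String, u32)>>(iter: I) -> Self {
        // We tried reserving space based on `size_hint()` here and it gave
        // no measurable gain, so the map simply grows on demand.
        let mut map = MultiMap::default();
        for (key, value) in iter {
            map.0.entry(key).or_default().push(value);
        }
        map
    }
}

/// Returns the size hint of a filtered slice iterator. `Filter` cannot know
/// how many items pass, so the lower bound is 0 and the upper bound is the
/// length of the underlying slice.
pub fn hint_after_filter(values: &[u32]) -> (usize, Option<usize>) {
    values.iter().filter(|v| **v % 2 == 0).size_hint()
}
```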

/// getting the [`IndexedCrate::publicly_importable_names`] over the items in the index.
fn build_imports_index<'a>(
index: &'a HashMap<Id, Item>,
value: &IndexedCrate<'a>,
Owner

This &IndexedCrate<'a> is actually not fully constructed at the time of this call. That is a potential footgun that could waste a few hours in a future iteration of tweaking this code, and I'd prefer to avoid introducing it in the first place.

Would you mind inlining this method back into IndexedCrate::new() where it's much more clear that the value is not fully constructed yet?

It might even be good to add an explicit comment about how it's not fully constructed at that point of use, to explain why extracting that complexity into its own method isn't desirable. In retrospect, that isn't obvious and I totally understand why you moved it here as a result 😅

Contributor Author

Fixed!
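The footgun described in this thread can be sketched roughly as follows. `Index`, `name_of`, and `ids_for` are hypothetical names invented for the example, and the real `IndexedCrate` has different fields. The point is only that the derived field is built inline in `new()`, next to a comment saying the value is still incomplete.

```rust
use std::collections::HashMap;

/// Hypothetical index type; not the real `IndexedCrate`.
pub struct Index<'a> {
    items: &'a HashMap<u32, String>,
    /// Filled in last; `None` while construction is still in progress.
    by_name: Option<HashMap<&'a str, Vec<u32>>>,
}

impl<'a> Index<'a> {
    pub fn new(items: &'a HashMap<u32, String>) -> Self {
        let mut value = Self { items, by_name: None };

        // NOTE: `value` is not fully constructed at this point: `by_name` is
        // still `None`. This step stays inline, instead of moving into a
        // helper that takes `&Index`, so that fact is visible where
        // `value` is used. Only methods that read `items` are safe to call.
        let mut by_name: HashMap<&'a str, Vec<u32>> = HashMap::new();
        for id in value.items.keys() {
            if let Some(name) = value.name_of(*id) {
                by_name.entry(name).or_default().push(*id);
            }
        }

        value.by_name = Some(by_name);
        value
    }

    /// Reads only `items`, so it is safe to call during construction.
    fn name_of(&self, id: u32) -> Option<&'a str> {
        self.items.get(&id).map(String::as_str)
    }

    /// Looks up the ids registered under `name` in the derived index.
    pub fn ids_for(&self, name: &str) -> Option<&[u32]> {
        self.by_name.as_ref()?.get(name).map(Vec::as_slice)
    }
}
```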

Comment on lines +129 to +102
let iter = index.iter();
iter.filter_map(|(id, item)| {
Owner

I mentioned in a parallel PR that as a personal preference thing, I find code easier to read if it creates and immediately consumes iterators instead of going through an intermediate variable.

If you don't mind, I'd appreciate it if you could tweak the code here and elsewhere in this PR in that manner, to help me maintain this code with less work in the long run:

Suggested change
- let iter = index.iter();
- iter.filter_map(|(id, item)| {
+ index.iter().filter_map(|(id, item)| {

Contributor Author

We introduce the variable back once we parallelize it with rayon:

#[cfg(feature = "rayon")]
let iter = index.par_iter();
#[cfg(not(feature = "rayon"))]
let iter = index.iter();

I also prefer it without the intermediate variable.

I can switch it here, since reverting the rayon feature would then leave the code in a better state.

Owner

Ah that explains it, thank you.

I agree, adding the variable in the rayon-specific PR would be ideal.

Contributor Author

Fixed!
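For context on why the intermediate variable returns in the rayon follow-up, here is a minimal sketch of the feature-gated pattern from this thread. It assumes a Cargo feature named `rayon` that enables an optional `rayon` dependency. The `named_ids` function and its input type are hypothetical and not taken from the PR.

```rust
use std::collections::HashMap;

// Only compiled when the (assumed) `rayon` Cargo feature is enabled.
#[cfg(feature = "rayon")]
use rayon::prelude::*;

/// Hypothetical helper: collects `(id, name)` pairs for entries that have a name.
pub fn named_ids(index: &HashMap<u32, Option<String>>) -> Vec<(u32, String)> {
    // The variable exists so that the rest of the chain is written once and
    // works for both the parallel and the sequential iterator.
    #[cfg(feature = "rayon")]
    let iter = index.par_iter();
    #[cfg(not(feature = "rayon"))]
    let iter = index.iter();

    let mut out: Vec<(u32, String)> = iter
        .filter_map(|(id, name)| name.clone().map(|n| (*id, n)))
        .collect();

    // Hash map iteration order is unspecified, so sort for a stable result.
    out.sort();
    out
}
```

Without the feature gate, the same chain reads as `index.iter().filter_map(...)` with no variable, which is the form preferred in this PR.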

@obi1kenobi
Owner

Btw I just made a Discord server for cargo-semver-checks and Trustfall, to make coordination a bit easier. You're welcome to join if you'd like: https://discord.gg/s7Cfmp3P

Modify imperative code into iterator code to make it easier to
parallelize with rayon.

No perf increase:

```
IndexedCrate/new(aws-sdk-ec2)
                        time:   [1.3491 s 1.3505 s 1.3520 s]
                        change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05)
                        Change within noise threshold.
```
@jalil-salame jalil-salame force-pushed the refactor-indexed-crate branch from 0339033 to 560c66e Compare August 27, 2024 19:36
@obi1kenobi obi1kenobi enabled auto-merge (squash) August 28, 2024 03:24
@obi1kenobi obi1kenobi disabled auto-merge August 28, 2024 03:26
@obi1kenobi obi1kenobi enabled auto-merge (squash) August 28, 2024 03:26
@obi1kenobi obi1kenobi merged commit 359ffa0 into obi1kenobi:rustdoc-v33 Aug 28, 2024
5 checks passed
obi1kenobi added a commit that referenced this pull request Aug 28, 2024
)

* chore: refactor IndexedCrate::new in preparation for parallelizing

Modify imperative code into iterator code to make it easier to
parallelize with rayon.

No perf increase:

```
IndexedCrate/new(aws-sdk-ec2)
                        time:   [1.3491 s 1.3505 s 1.3520 s]
                        change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05)
                        Change within noise threshold.
```

* Update src/indexed_crate.rs

---------

Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
obi1kenobi added a commit that referenced this pull request Aug 28, 2024
) (#410)

* chore: refactor IndexedCrate::new in preparation for parallelizing

Modify imperative code into iterator code to make it easier to
parallelize with rayon.

No perf increase:

```
IndexedCrate/new(aws-sdk-ec2)
                        time:   [1.3491 s 1.3505 s 1.3520 s]
                        change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05)
                        Change within noise threshold.
```

* Update src/indexed_crate.rs

---------

Co-authored-by: Jalil David Salamé Messina <60845989+jalil-salame@users.noreply.github.com>
obi1kenobi added a commit that referenced this pull request Aug 28, 2024
) (#412)

* chore: refactor IndexedCrate::new in preparation for parallelizing

Modify imperative code into iterator code to make it easier to
parallelize with rayon.

No perf increase:

```
IndexedCrate/new(aws-sdk-ec2)
                        time:   [1.3491 s 1.3505 s 1.3520 s]
                        change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05)
                        Change within noise threshold.
```

* Update src/indexed_crate.rs

---------

Co-authored-by: Jalil David Salamé Messina <60845989+jalil-salame@users.noreply.github.com>
obi1kenobi added a commit that referenced this pull request Aug 28, 2024
) (#413)

* chore: refactor IndexedCrate::new in preparation for parallelizing

Modify imperative code into iterator code to make it easier to
parallelize with rayon.

No perf increase:

```
IndexedCrate/new(aws-sdk-ec2)
                        time:   [1.3491 s 1.3505 s 1.3520 s]
                        change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05)
                        Change within noise threshold.
```

* Update src/indexed_crate.rs

---------

Co-authored-by: Jalil David Salamé Messina <60845989+jalil-salame@users.noreply.github.com>
obi1kenobi added a commit that referenced this pull request Aug 28, 2024
) (#411)

* chore: refactor IndexedCrate::new in preparation for parallelizing

Modify imperative code into iterator code to make it easier to
parallelize with rayon.

No perf increase:

```
IndexedCrate/new(aws-sdk-ec2)
                        time:   [1.3491 s 1.3505 s 1.3520 s]
                        change: [-0.8260% -0.7120% -0.6178%] (p = 0.00 < 0.05)
                        Change within noise threshold.
```

* Update src/indexed_crate.rs

---------

Co-authored-by: Jalil David Salamé Messina <60845989+jalil-salame@users.noreply.github.com>
@jalil-salame jalil-salame deleted the refactor-indexed-crate branch August 28, 2024 05:49