
Optimize insertion sort #40807


Merged

merged 1 commit into rust-lang:master from optimize-insertion-sort on Mar 26, 2017

Conversation

@ghost commented Mar 24, 2017

This change slightly restructures the main iteration loop so that LLVM can optimize it more efficiently.

Benchmark:

```
name                                   before ns/iter   after ns/iter    diff ns/iter   diff %
slice::sort_unstable_small_ascending   39 (2051 MB/s)   38 (2105 MB/s)             -1   -2.56%
slice::sort_unstable_small_big_random  579 (2210 MB/s)  575 (2226 MB/s)            -4   -0.69%
slice::sort_unstable_small_descending  80 (1000 MB/s)   70 (1142 MB/s)            -10  -12.50%
slice::sort_unstable_small_random      396 (202 MB/s)   386                       -10   -2.53%
```

The benchmark is not a fluke: performance on `small_descending` is consistently better after this change. I'm not 100% sure why this makes things faster, but my guess is that, to the compiler, `v.len() + 1` looks like it could overflow in theory.
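
To make the shape of the change concrete, here is a minimal sketch, not the exact libcore code: the rewrite amounts to moving the `+ 1` out of the loop bound. The `shift_tail` helper below is a simplified, hypothetical stand-in for the real routine that shifts the last element of a slice into sorted position:

```
// Hypothetical, simplified helper: move the last element of `v`
// backwards until it sits in sorted position.
fn shift_tail<T, F: FnMut(&T, &T) -> bool>(v: &mut [T], is_less: &mut F) {
    let mut i = v.len() - 1;
    while i > 0 && is_less(&v[i], &v[i - 1]) {
        v.swap(i, i - 1);
        i -= 1;
    }
}

// Before: the `v.len() + 1` upper bound looks to the compiler like it
// could overflow, which can block some loop optimizations.
fn insertion_sort_before<T, F: FnMut(&T, &T) -> bool>(v: &mut [T], is_less: &mut F) {
    for i in 2..v.len() + 1 {
        shift_tail(&mut v[..i], is_less);
    }
}

// After: no arithmetic in the loop bound; the sorted prefix grows by
// one element inside the body instead.
fn insertion_sort_after<T, F: FnMut(&T, &T) -> bool>(v: &mut [T], is_less: &mut F) {
    for i in 1..v.len() {
        shift_tail(&mut v[..i + 1], is_less);
    }
}
```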

@rust-highfive (Contributor)

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@ghost (Author) commented Mar 24, 2017

r? @alexcrichton

@rust-highfive assigned alexcrichton and unassigned aturon on Mar 24, 2017
@alexcrichton (Member)

@bors: r+

@bors (Collaborator) commented Mar 25, 2017

📌 Commit 2c816f7 has been approved by alexcrichton

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Mar 25, 2017
…=alexcrichton

Optimize insertion sort

bors added a commit that referenced this pull request Mar 25, 2017
Rollup of 11 pull requests

- Successful merges: #40347, #40501, #40516, #40524, #40540, #40642, #40683, #40764, #40778, #40807, #40809
- Failed merges: #40771
@nagisa (Member) commented Mar 25, 2017

Since this is internal to libstd/core, could you check whether the inclusive range syntax makes things better as well?

i.e. `2 ... v.len()`

@ghost (Author) commented Mar 25, 2017

@nagisa Inclusive range syntax makes performance slightly worse, actually...

Both with the old insertion sort and with inclusive range syntax, there's a bounds check at the beginning of insertion sort. This PR removes that bounds check.

If you want to play with this, here's a playpen link.
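
For reference, a sketch of the inclusive-range variant under discussion, written with the modern `..=` spelling of the 2017 `...` syntax, and using the same hypothetical `shift_tail` helper as in the sketch under the PR description:

```
// Inclusive-range variant suggested above. Per the measurements
// mentioned in this thread, it was slightly slower: the compiler
// keeps a bounds check at the start of insertion sort that the
// merged version avoids.
fn insertion_sort_inclusive<T, F: FnMut(&T, &T) -> bool>(v: &mut [T], is_less: &mut F) {
    for i in 2..=v.len() {
        shift_tail(&mut v[..i], is_less);
    }
}
```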

@bors (Collaborator) commented Mar 25, 2017

⌛ Testing commit 2c816f7 with merge 04e47d7...

frewsxcv added a commit to frewsxcv/rust that referenced this pull request Mar 25, 2017
…=alexcrichton

Optimize insertion sort

bors added a commit that referenced this pull request Mar 26, 2017
Rollup of 7 pull requests

- Successful merges: #40642, #40734, #40740, #40771, #40807, #40820, #40821
- Failed merges:
@bors merged commit 2c816f7 into rust-lang:master on Mar 26, 2017
@ghost deleted the optimize-insertion-sort branch on March 26, 2017 at 16:54