Skip to content

Do not inline finish_grow #78682

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 1 commit into from
Dec 15, 2020
Merged

Do not inline finish_grow #78682

merged 1 commit into from
Dec 15, 2020

Conversation

glandium
Copy link
Contributor

@glandium glandium commented Nov 2, 2020

Fixes #78471.

Looking at libgkrust.a in Firefox, the sizes for the gkrust.*.o file is:

  • 18584816 (text) 582418 (data) with unmodified master
  • 17937659 (text) 582554 (data) with Tiny Vecs are dumb. #72227 reverted
  • 17968228 (text) 582858 (data) with #[inline(never)] on grow_amortized and grow_exact, but that has some performance consequences
  • 17927760 (text) 582322 (data) with this change

So in terms of size, at least in the case of Firefox, this patch more than undoes the regression. I don't think it should affect performance, but we'll see.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 2, 2020
@glandium
Copy link
Contributor Author

glandium commented Nov 2, 2020

If I look at libgkrust.a in Firefox, the .text size for the gkrust.*.o file is:

  • 19118203 with current master
  • 18402141 with Tiny Vecs are dumb. #72227 reverted
  • 18356853 with #[inline(never)] on grow_amortized and grow_exact
  • 18337894 with this patch

So in terms of size, at least in the case of Firefox, this patch more than undoes the regression. I don't think it should affect performance, but we'll see.

@jyn514 jyn514 added I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Nov 2, 2020
@jyn514
Copy link
Member

jyn514 commented Nov 2, 2020

Would it make sense to instead mark grow_amortized and grow_exact as inline(never)?

@jyn514
Copy link
Member

jyn514 commented Nov 2, 2020

Oh I missed #78483, sorry.

@gparker42
Copy link

This looks like the right inlining boundary as intended by @nnethercote. finish_grow deliberately handles as much type-independent work as possible, so a single copy could be shared by all Vec types.

#78471 reported grow_amortized being inlined all the way back into callers of Vec::push. @glandium, do you see that in your libgkrust.a? That sounds unprofitable - I'd expect the call to RawVec::reserve to be another inlining boundary - but I couldn't reproduce the reported aggressive inlining in my own small tests.

@glandium
Copy link
Contributor Author

glandium commented Nov 3, 2020

It may only happen with -Clto.

@gparker42
Copy link

Ah, that looks like it: I need both -Clto and a sufficiently large inline-threshold (larger than the opt-level=3 default).

Can you build your libgkrust.a with an inlining boundary at RawVec::reserve? That reduces the Vec::push inlined size to around ten instructions. (I'm still getting set up for code size experiments here.)

@lcnr
Copy link
Contributor

lcnr commented Nov 3, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Collaborator

bors commented Nov 3, 2020

⌛ Trying commit 068782448bdf657c6be701c14835d03e1d322c90 with merge 452835b56923f7ef0f8d102f1e96c939cec7af07...

@bors
Copy link
Collaborator

bors commented Nov 3, 2020

☀️ Try build successful - checks-actions
Build commit: 452835b56923f7ef0f8d102f1e96c939cec7af07 (452835b56923f7ef0f8d102f1e96c939cec7af07)

@rust-timer
Copy link
Collaborator

Queued 452835b56923f7ef0f8d102f1e96c939cec7af07 with parent 7b5a9e9, future comparison URL.

@glandium
Copy link
Contributor Author

glandium commented Nov 3, 2020

Ah, that looks like it: I need both -Clto and a sufficiently large inline-threshold (larger than the opt-level=3 default).

It happens in Firefox with -Clto -Copt-level=2.

Can you build your libgkrust.a with an inlining boundary at RawVec::reserve? That reduces the Vec::push inlined size to around ten instructions. (I'm still getting set up for code size experiments here.)

That would likely make things slower, similarly to #78483.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (452835b56923f7ef0f8d102f1e96c939cec7af07): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-perf

@gparker42
Copy link

Can you build your libgkrust.a with an inlining boundary at RawVec::reserve? That reduces the Vec::push inlined size to around ten instructions. (I'm still getting set up for code size experiments here.)

That would likely make things slower, similarly to #78483.

I'm betting that reserve can be a more profitable inlining boundary than grow_amortized was, but I'll tinker with that on my own.

@nnethercote
Copy link
Contributor

I'm betting that reserve can be a more profitable inlining boundary than grow_amortized was, but I'll tinker with that on my own.

I tried a number of things in and around reserve, results were sensitive and unpredictable and I couldn't get improvements.

This patch seems fine, though. I'm a little surprised I didn't never-inline finish_grow myself.

@bors r+

@bors
Copy link
Collaborator

bors commented Nov 6, 2020

📌 Commit 068782448bdf657c6be701c14835d03e1d322c90 has been approved by nnethercote

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 6, 2020
@bors
Copy link
Collaborator

bors commented Nov 6, 2020

⌛ Testing commit 068782448bdf657c6be701c14835d03e1d322c90 with merge 1052db144c002891ffbd4517ea0071a222b52070...

@bors
Copy link
Collaborator

bors commented Nov 6, 2020

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Nov 6, 2020
@glandium
Copy link
Contributor Author

glandium commented Nov 6, 2020

/checkout/src/test/codegen/vec-iter-collect-len.rs:9:16: error: CHECK-NOT: excluded string found in input
 // CHECK-NOT: call
               ^
/checkout/obj/build/i686-unknown-linux-gnu/test/codegen/vec-iter-collect-len/vec-iter-collect-len.ll:134:2: note: found here
 call void @llvm.lifetime.start.p0i8(i64 12, i8* nonnull %0)
 ^~~~

How can one work around that?

@jyn514
Copy link
Member

jyn514 commented Nov 12, 2020

$ git log --oneline src/test/codegen/vec-iter-collect-len.rs
2a663555ddf Remove licenses
1001b2beee5 inore some codegen tests when debug assertions are enabled
f67453729c1 std: Ensure OOM is classified as `nounwind`
$ git show f67453729c1 src/test/codegen/vec-iter-collect-len.rs
commit f67453729c19b435686c94936d8145051e7f1284
Author: Alex Crichton <alex@alexcrichton.com>
Date:   Thu May 24 12:03:05 2018 -0700

    std: Ensure OOM is classified as `nounwind`
    
    OOM can't unwind today, and historically it's been optimized as if it can't
    unwind. This accidentally regressed with recent changes to the OOM handler, so
    this commit adds in a codegen test to assert that everything gets optimized away
    after the OOM function is approrpiately classified as nounwind
    
    Closes #50925


added: src/test/codegen/vec-iter-collect-len.rs
────────────────────────────────────────────────────────────────────────────────────

1
// Copyright 2017 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

// no-system-llvm
// compile-flags: -O
#![crate_type="lib"]

#[no_mangle]
pub fn get_len() -> usize {
    // CHECK-LABEL: @get_len
    // CHECK-NOT: call
    // CHECK-NOT: invoke
    [1, 2, 3].iter().collect::<Vec<_>>().len()
}

@alexcrichton do you understand why this change could have broken the test?

@glandium
Copy link
Contributor Author

glandium commented Dec 8, 2020

FTR, I already validated that it still optimizes the collect().len() case from vec-iter-collect-len.rs.

@glandium
Copy link
Contributor Author

glandium commented Dec 9, 2020

Should we queue another round of perf tests for the SpecFromIterNested change?

@jyn514
Copy link
Member

jyn514 commented Dec 9, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Collaborator

bors commented Dec 9, 2020

⌛ Trying commit 76bd145 with merge 1618e13494fc48a6834838c2d90011e29f0434ea...

@bors
Copy link
Collaborator

bors commented Dec 9, 2020

☀️ Try build successful - checks-actions
Build commit: 1618e13494fc48a6834838c2d90011e29f0434ea (1618e13494fc48a6834838c2d90011e29f0434ea)

@rust-timer
Copy link
Collaborator

Queued 1618e13494fc48a6834838c2d90011e29f0434ea with parent c16d52d, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (1618e13494fc48a6834838c2d90011e29f0434ea): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot modify labels: +S-waiting-on-review -S-waiting-on-perf

@jyn514
Copy link
Member

jyn514 commented Dec 10, 2020

Wow, that is a really large savings in max-rss 🎉

@the8472
Copy link
Member

the8472 commented Dec 10, 2020

max-rss is really noisy, compare to the noise run which shows similarly-sized fluctuations on other crates.

@lcnr
Copy link
Contributor

lcnr commented Dec 14, 2020

i'm a bit out of the loop here, sry

It looks like it is fine to land this PR as is rn and I would be fine with landing if #78682 (comment) is still up to date. It might even make sense to move that summary into the PR description so that it's added to the merge commit.

@bors delegate+

@bors
Copy link
Collaborator

bors commented Dec 14, 2020

✌️ @glandium can now approve this pull request

@glandium
Copy link
Contributor Author

I'll get new numbers and paste them in the PR message. I've never used bors myself, do I @-it with "rollup" when I'm done?

@jyn514
Copy link
Member

jyn514 commented Dec 14, 2020

@glandium you would say @bors r=lcnr rollup=never. rollup=always is only for PRs that don't affect perf or are trivial, like fixing typos in comments.

@bors
Copy link
Collaborator

bors commented Dec 14, 2020

📌 Commit 76bd145 has been approved by lcnr

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 14, 2020
@jyn514
Copy link
Member

jyn514 commented Dec 14, 2020

damn it bors

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Dec 14, 2020
@glandium
Copy link
Contributor Author

@bors r=lcnr rollup=never

@bors
Copy link
Collaborator

bors commented Dec 15, 2020

📌 Commit 76bd145 has been approved by lcnr

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 15, 2020
@bors
Copy link
Collaborator

bors commented Dec 15, 2020

⌛ Testing commit 76bd145 with merge e261649...

@bors
Copy link
Collaborator

bors commented Dec 15, 2020

☀️ Test successful - checks-actions
Approved by: lcnr
Pushing e261649 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 15, 2020
@bors bors merged commit e261649 into rust-lang:master Dec 15, 2020
@rustbot rustbot added this to the 1.50.0 milestone Dec 15, 2020
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
I-heavy Issue: Problems and improvements with respect to binary size of generated code. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Code bloat from RawVec::grow_amortized