Skip to content

Copy trait defeats generator optimization #62952

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Open
tmandry opened this issue Jul 24, 2019 · 2 comments
Open

Copy trait defeats generator optimization #62952

tmandry opened this issue Jul 24, 2019 · 2 comments
Labels
A-coroutines Area: Coroutines C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tmandry
Copy link
Member

tmandry commented Jul 24, 2019

The following example (playground):

#![feature(generators, generator_trait)]

use std::ops::Generator;

#[derive(Clone)]
struct Foo([u8; 1024]);

fn overlap_foo() -> impl Generator<Yield = (), Return = ()>{
    static || {
        let x = Foo([0u8; 1024]);
        yield;
        drop(x);
        let y = Foo([0u8; 1024]);
        yield;
        drop(y);
    }
}

#[derive(Clone, Copy)]
struct Bar([u8; 1024]);

fn overlap_bar() -> impl Generator<Yield = (), Return = ()>{
    static || {
        let x = Bar([0u8; 1024]);
        yield;
        drop(x);
        let y = Bar([0u8; 1024]);
        yield;
        drop(y);
    }
}

fn main() {
    dbg!(std::mem::size_of_val(&overlap_foo()));
    dbg!(std::mem::size_of_val(&overlap_bar()));
}

Outputs:

[gen-explicit-drop.rs:34] std::mem::size_of_val(&overlap_foo()) = 1028                                                                                                                                                                                                                                                       
[gen-explicit-drop.rs:35] std::mem::size_of_val(&overlap_bar()) = 2052  

The only difference between Foo and Bar is that Foo is Copy. This defeats the generator optimization implemented in #61922 which considers any local that has been moved from as a candidate for overlap.

Technically, this is semantically consistent with the existing overlap behavior for generators. But making a type Copy and getting worse performance is a footgun that we should fix.

This can be fixed with a pass that turns unnecessary copies into moves in MIR.

@jonas-schievink jonas-schievink added A-coroutines Area: Coroutines C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 24, 2019
@Centril
Copy link
Contributor

Centril commented Jul 25, 2019

This can be fixed with a pass that turns unnecessary copies into moves in MIR.

I thought that's what https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/copy_prop/index.html does?

cc @oli-obk

@Centril Centril added the I-heavy Issue: Problems and improvements with respect to binary size of generated code. label Jul 25, 2019
@tmandry
Copy link
Member Author

tmandry commented Jul 30, 2019

I thought that's what https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/transform/copy_prop/index.html does?

Nice find, this does indeed do the required transformation. Running that (along with simplify_locals) right before the generator transform almost does the trick.

The problem is that we currently eliminate all storage markers as part of copy_prop, and this defeats the generator optimization. If we merged the live ranges of the two variables instead, it should work.

(EDIT: simplify_locals is not needed)

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
A-coroutines Area: Coroutines C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants