
GC runtime is O(allocations), so it eventually dominates runtime #3209

Open
saethlin opened this issue Dec 4, 2023 · 2 comments
Labels
A-interpreter Area: affects the core interpreter C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement I-slow Impact: Makes Miri even slower than it already is

Comments

@saethlin
Member

saethlin commented Dec 4, 2023

I have been looking into the discussion we had on #3194. I stuck some code into Miri that prints the fraction of runtime spent in the GC, then turned it loose on the regex test suite. nextest is great for this because we get a fresh interpreter for each test. Some of the very long-running tests spend up to 80% of their runtime in the GC.

After some experimentation, I think the root cause is that the Provenance GC runs on a basic block interval, but walks the entire allocation map every time it is run.

Here's a very silly program that exhibits explosive GC work:

```rust
fn main() {
    let layout = std::alloc::Layout::from_size_align(1, 1).unwrap();
    for _ in 0..100_000 {
        unsafe {
            // Deliberately leak each allocation so the allocation map
            // only ever grows; every GC run must walk all of them.
            let _ = std::alloc::alloc(layout);
        }
    }
}
```

My instrumentation indicates this program (which runs in 8 seconds) spends 27% of its runtime in the GC with the default GC interval. Increasing the number of allocations drives that fraction towards 100%: bumping the allocation count up by a factor of 10 jumps the runtime to 300 seconds and the fraction of time in GC to 80%. A million cold allocations is quite a few, but even with 32-byte buckets that's only 32 MB of heap. Hardly a memory-hungry program.

We get explosive runtime because every GC run scans all of memory even though only a tiny fraction of it contains garbage. As far as I'm aware, that observation is the foundation of generational garbage collectors (I know precious little about GCs).
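To make the blow-up concrete, here is a toy cost model (an illustration, not Miri's actual code): assume one leaked allocation per basic block and a full scan of the allocation map every `interval` blocks. Total scan work then grows quadratically in the number of allocations:

```rust
// Toy cost model: one leaked allocation per basic block, and a GC run
// every `interval` blocks that scans every live allocation.
fn total_gc_work(allocations: u64, interval: u64) -> u64 {
    let mut work = 0;
    let mut live = 0;
    for block in 1..=allocations {
        live += 1; // each block leaks one cold allocation
        if block % interval == 0 {
            work += live; // a GC run walks the whole allocation map
        }
    }
    work
}

fn main() {
    let w1 = total_gc_work(100_000, 10_000);
    let w2 = total_gc_work(1_000_000, 10_000);
    // 10x the allocations costs ~100x the GC work: O(n^2 / interval).
    assert!(w2 / w1 > 90);
    println!("work for 100k allocs: {w1}, for 1M allocs: {w2}");
}
```

This matches the observed behavior: a 10x increase in allocations produced far more than a 10x increase in runtime.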

I feel like a generational approach is the best way out of this problem. I'm starting to think it over and it might be workable. I'm concerned about the general complexity increase of splitting memory into multiple HashMaps, but if all the complexity can be buried in MonoHashMap or our AllocMap implementation, maybe it's manageable.
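To make the generational idea concrete, here is a purely illustrative sketch (none of these names come from Miri's MonoHashMap or AllocMap) of a two-generation allocation map, where a minor GC only walks the nursery and promotes survivors:

```rust
use std::collections::HashMap;

// Illustrative only: a two-generation allocation map. A minor GC scans
// just the nursery, so its cost is O(recent allocations) rather than
// O(all allocations). `is_live` stands in for "this allocation's
// provenance is still reachable".
struct GenerationalMap {
    nursery: HashMap<u64, String>, // recently created allocations
    old: HashMap<u64, String>,     // allocations that survived a minor GC
}

impl GenerationalMap {
    fn new() -> Self {
        Self { nursery: HashMap::new(), old: HashMap::new() }
    }

    fn insert(&mut self, id: u64, alloc: String) {
        self.nursery.insert(id, alloc);
    }

    fn minor_gc(&mut self, is_live: impl Fn(u64) -> bool) {
        let nursery = std::mem::take(&mut self.nursery);
        for (id, alloc) in nursery {
            if is_live(id) {
                self.old.insert(id, alloc); // survivor: promote
            } // otherwise the allocation is dead and simply dropped
        }
    }
}

fn main() {
    let mut map = GenerationalMap::new();
    for id in 0..100 {
        map.insert(id, format!("alloc{id}"));
    }
    map.minor_gc(|id| id % 2 == 0); // only even ids are still live
    assert_eq!(map.nursery.len(), 0);
    assert_eq!(map.old.len(), 50);
}
```

The cold allocations from the example program would all die (or settle) in the nursery, so repeated GC runs would stop paying for them.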

@RalfJung
Member

RalfJung commented Dec 5, 2023

Even with a generational GC we have to scan the long-term heap at some point, don't we? A full GC run is fundamentally O(live-allocations); that's one reason why hazard pointers or similar approaches can deliver better performance.

We could adjust the GC interval to the size of the heap or so; not sure if that makes sense.

My instrumentation indicates this program (which runs in 8 seconds) with the default GC interval spends 27% of its runtime in the GC.

How long does it run without the GC?

@saethlin
Member Author

saethlin commented Dec 6, 2023

How long does it run without the GC?

About 6.5 seconds (close to what the 27% would suggest, but not exactly).

We could adjust the GC interval to the size of the heap or so; not sure if that makes sense.

I considered this, but I think it would severely penalize execution patterns like `slice::get_unchecked` that really benefit from the GC running often, because the frequent runs prevent algorithmic runaway in a different part of the code. Maybe I'll try it anyway just to see what the effect is on the regex unit tests.
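For reference, the interval-scaling idea could look something like the sketch below. The function name, the base interval, and the cap are all assumptions for illustration, not anything Miri actually does:

```rust
// Sketch of an adaptive GC interval (assumed scheme, not Miri's):
// grow the interval with the heap so each scan's O(heap) cost is
// amortized over proportionally more basic blocks, but cap it so the
// GC still runs often enough to help patterns like slice::get_unchecked.
fn next_gc_interval(live_allocations: usize) -> usize {
    const BASE_INTERVAL: usize = 10_000;
    const MAX_INTERVAL: usize = 1_000_000;
    // Linear in heap size: amortized GC cost per basic block stays bounded.
    (BASE_INTERVAL + live_allocations).min(MAX_INTERVAL)
}

fn main() {
    assert_eq!(next_gc_interval(0), 10_000);
    assert_eq!(next_gc_interval(90_000), 100_000);
    assert_eq!(next_gc_interval(10_000_000), 1_000_000);
}
```

The cap is the crux of the trade-off discussed above: with no cap, a huge heap would make the GC run so rarely that it stops preventing runaway elsewhere.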

@RalfJung RalfJung added C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement A-interpreter Area: affects the core interpreter I-slow Impact: Makes Miri even slower than it already is labels Apr 18, 2024