Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

fix(spool): Revert to page counts for size estimate #3379

Merged
merged 4 commits into from
Apr 8, 2024
Merged

Conversation

jjbayer
Copy link
Member

@jjbayer jjbayer commented Apr 5, 2024

INC-703 showed that estimate_spool_size via dbstat is too slow. This PR replaces the estimate with the number of non-free pages multiplied by the page size. Though inaccurate, benchmarks show that this is fast regardless of the db size.

Fixes https://github.com/getsentry/team-ingest/issues/307.

@@ -1569,26 +1569,25 @@ mod tests {
"buffer.writes:1|c",
"buffer.envelopes_written:3|c",
"buffer.envelopes_disk_count:3|g",
"buffer.disk_size:1031|h",
"buffer.disk_size:24576|h",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty far off. But the relative error should be lower for larger disk buffers.

@@ -74,7 +74,9 @@ pub fn delete<'a>(key: QueueKey) -> Query<'a, Sqlite, SqliteArguments<'a>> {
///
/// This info used to calculate the current allocated database size.
pub fn current_size<'a>() -> Query<'a, Sqlite, SqliteArguments<'a>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the docs for this function and the calling function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 fun fact: The doc on the calling function was already correct because we never updated it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite, it didn't subtract the free pages.

relay-server/src/services/spooler/sql.rs Outdated Show resolved Hide resolved
Comment on lines +1950 to +1954
// Result:
// pgsize = 2'408'464'463 t.elapsed() = 7.007533833s
// pgsize_agg = 2'408'464'463 t.elapsed() = 5.010104791s
// brute_force = 1'750'000'000 t.elapsed() = 7.893590875s
// pragma = 3'036'307'456 t.elapsed() = 213.417µs
Copy link
Member

@Dav1dde Dav1dde Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! But it over estimates by quite a bit > 1 GiB?

Edit: To clarify I think that's okay, just surprising.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep the estimates are pretty far off, this is something we will have to keep in mind.

@jjbayer jjbayer marked this pull request as ready for review April 5, 2024 13:49
@jjbayer jjbayer requested a review from a team as a code owner April 5, 2024 13:49
Copy link
Member

@Dav1dde Dav1dde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is changelog worthy, in my PR I had:

**Bug Fixes**:

- Fix performance regression in the spooler. ([#3378](https://github.com/getsentry/relay/pull/3378))

@jjbayer
Copy link
Member Author

jjbayer commented Apr 8, 2024

As discussed with @olksdr: Because the spool size estimation is also used to decide whether or not to transition back to memory mode, and the new computation over-estimates the spool size, this PR might require a follow-up if we notice that Relay gets stuck in disk mode.

With the performance regression fixed, getting stuck in disk mode should not be a big problem performance-wise. We could even consider operating always in disk mode to simplify the state machine.

@jjbayer jjbayer merged commit b7b1cab into master Apr 8, 2024
20 checks passed
@jjbayer jjbayer deleted the fix/spool-pragma branch April 8, 2024 09:14
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants