Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

adapter: make all DurableTimestampOracle methods async #22091

Merged

Conversation

aljoscha
Copy link
Contributor

@aljoscha aljoscha commented Oct 2, 2023

We do this as preparation for introducing a TimestampOracle trait, which we eventually want to back by RPC or calls to an external consensus system, both of which require async'ness.

This also requires making the call stack up to timestamp oracle methods async.

Part of MaterializeInc/database-issues#6635

Tips for reviewer

This is potentially one of the more spicy commits because it adds async to some coordinator methods, and the @MaterializeInc/adapter team might have stronger opinions on that.

#21671 has this commit and all the follow-up commits for full MaterializeInc/database-issues#6635, it might be good to look at that for context.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:

We do this as preparation for introducing a TimestampOracle trait, which
we eventually want to back by RPC or calls to an external consensus
system, both of which require async'ness.

This also requires making the call stack up to timestamp oracle methods
async.
@aljoscha aljoscha added the T-platform-v2 Theme: Platform v2 label Oct 2, 2023
@aljoscha aljoscha requested a review from a team as a code owner October 2, 2023 09:37
@aljoscha aljoscha requested a review from maddyblue October 2, 2023 09:37
@shepherdlybot
Copy link

shepherdlybot bot commented Oct 2, 2023

This PR has higher risk. Make sure to carefully review the file hotspots. In addition to having a knowledgeable reviewer, it may be useful to add observability and/or a feature flag. What's This?

Risk Score Probability Buggy File Hotspots
🔴 80 / 100 59% 3
Buggy File Hotspots:
File Percentile
../sequencer/inner.rs 99
../src/coord.rs 99
../coord/ddl.rs 94

@nrainer-materialize nrainer-materialize self-requested a review October 2, 2023 09:38
@nrainer-materialize
Copy link
Contributor

Let me trigger a Nightly and a coverage build once the tests pipeline is green.

@aljoscha
Copy link
Contributor Author

aljoscha commented Oct 2, 2023

Let me trigger a Nightly and a coverage build once the tests pipeline is green.

I think Sheperdly becoming aware of this has at least tiggered nighlies already. Unless that was you. 😅

@nrainer-materialize
Copy link
Contributor

Let me trigger a Nightly and a coverage build once the tests pipeline is green.

I think Sheperdly becoming aware of this has at least tiggered nighlies already. Unless that was you. 😅

It's the CI 😀
It triggers nightly if you touch certain files (e.g., build definitions or test framework code).

Copy link
Contributor

@jkosh44 jkosh44 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'm OK with all the added asyncs.

schedule_storage_usage_collection is the only one where I feel it's a bit unfortunate that it's become async. It's meant to be obviously cheap so we don't accidentally start doing part of the actual collection work in that method. We could pass a timestamp into the method but, I don't really feel that strongly and I think getting the timestamp inside the method is a bit clearer.

@nrainer-materialize
Copy link
Contributor

I had triggered this coverage build (#244).
It looks good, everything is at least covered by a unit test. (Note that the await keyword is shown as uncovered because it is not properly supported by the coverage collection step: rust-lang/rust#98712)

@aljoscha
Copy link
Contributor Author

aljoscha commented Oct 2, 2023

I had triggered this coverage build (#244). It looks good, everything is at least covered by a unit test. (Note that the await keyword is shown as uncovered because it is not properly supported by the coverage collection step: rust-lang/rust#98712)

Thank you for that additional context! How worried should we be about those nightly failures (I saw you restarted a couple jobs?). I don't know how healthy nightly currently is.

@nrainer-materialize
Copy link
Contributor

Thank you for that additional context! How worried should we be about those nightly failures (I saw you restarted a couple jobs?). I don't know how healthy nightly currently is.

We are good, Nightly is green. Some jobs in Nightly flicker a bit, I triggered a retry of "Parallel Workload (cancel)". I don't think that this is related to the changes, I think I have seen that elsewhere too.

As for coverage build, failures are rather common. Tests with coverage enabled need more memory and take significantly longer. Consequently, they are more likely to run into timeouts.

@aljoscha aljoscha merged commit 7cbbdde into MaterializeInc:main Oct 4, 2023
@aljoscha aljoscha deleted the adapter-asyncify-timestamp-oracle branch October 4, 2023 09:03
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
T-platform-v2 Theme: Platform v2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants