Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Rewrite durable object alarm deduplication with cancellation considerations #726

Merged
merged 1 commit into from
Jun 6, 2023

Conversation

bcaimano
Copy link
Contributor

@bcaimano bcaimano commented Jun 2, 2023

This commit does a few notable things:

  • The WorkerEntrypoint now invokes context.run() and incomingRequest.drain() on the promise chain for the runAlarm() request. This means that we are able to report to the actor when that request is cancelled, which implies that the worker entrypoint will be going away. (There is an explicit guard for request cancellation now.)
  • The Worker::Actor now has two functions for alarms. getAlarm() checks if there is an existing alarm for a given time synchronously. scheduleAlarm() provides a promise that will resolve when it is time to run the alarm or the alarm has been cancelled. It also cancels the currently scheduled alarm if there is one.
  • Durable object alarms in the sandbox can now be "scheduled" or "running" instead of "running" or "queued". This is meaningful because "running" alarms cannot be cancelled and have associated metrics events but "scheduled" alarms can be cancelled and have no metrics events.

Taken together, these changes attempt to ensure that there is a live worker entrypoint associated with each scheduled or running alarm. This can be especially important because other worker entrypoints may be waiting upon the result from that alarm.

@bcaimano bcaimano force-pushed the bcaimano/run-alarm branch 2 times, most recently from 86484be to 8cc0ee6 Compare June 2, 2023 16:49
Copy link
Member

@a-robinson a-robinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd feel a lot better about @xortive reviewing this than me, but this LGTM

}

KJ_IF_MAYBE(scheduledAlarm, impl->maybeScheduledAlarm) {
// We had a previously scheduled alarm, let's cancel it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if it was for the same time? Do we really want to cancel it even in that case or would we be better off just adding a branch to its result promise?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the only caller of scheduleAlarm() calls getAlarm() in the same synchronous scope, essentially doing what you describe. I'll add the comment in the header!


co_return co_await whenReady.then([this]() -> ScheduleAlarmResult {
// It's time to run! Let's tear apart the scheduled alarm and make a running alarm.
auto scheduledAlarm = KJ_ASSERT_NONNULL(kj::mv(impl->maybeScheduledAlarm));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we don't need to verify that impl->maybeScheduledAlarm still contains the expected scheduledTime? Because whenCanceled would definitely be resolved first (before whenReady) if scheduledAlarm had been removed/changed?

That looks like it's probably true, I just haven't convinced myself it -- so mostly I'm asking for reassurance that you have convinced yourself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, you have it right, but that deserves a comment!

@bcaimano bcaimano force-pushed the bcaimano/run-alarm branch from 8cc0ee6 to 3ff03a5 Compare June 5, 2023 17:22
…ations

This commit does a few notable things:
- The WorkerEntrypoint now invokes `context.run()` and `incomingRequest.drain()` on the promise chain for the `runAlarm()` request. This means that we are able to report to the actor when that request is cancelled, which implies that the worker entrypoint will be going away. (There is an explicit guard for request cancellation now.)
- The Worker::Actor now has two functions for alarms. `getAlarm()` checks if there is an existing alarm for a given time synchronously. `scheduleAlarm()` provides a promise that will resolve when it is time to run the alarm or the alarm has been cancelled. It also cancels the currently scheduled alarm if there is one.
- Durable object alarms in the sandbox can now be "scheduled" or "running" instead of "running" or "queued". This is meaningful because "running" alarms cannot be cancelled and have associated metrics events but "scheduled" alarms can be cancelled and have no metrics events.

Taken together, these changes attempt to ensure that there is a live worker entrypoint associated with each scheduled or running alarm. This can be especially important because other worker entrypoints may be waiting upon the result from that alarm.
@bcaimano bcaimano force-pushed the bcaimano/run-alarm branch from 3ff03a5 to 1008d6c Compare June 5, 2023 17:27
@bcaimano bcaimano merged commit 169bf6f into main Jun 6, 2023
@bcaimano bcaimano deleted the bcaimano/run-alarm branch June 6, 2023 18:04
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants