fix(core): Reduce risk of race condition during workflow activation loop #13186
@@ -90,7 +90,6 @@ export class ActiveWorkflowManager {
 await this.addActiveWorkflows('init');

 await this.externalHooks.run('activeWorkflows.initialized');
-await this.webhookService.populateCache();
We are already populating the cache immediately after adding the webhooks for every activated workflow.
Not related to this PR, but I noticed this: combined with #13191, we now query all webhooks multiple times and add them to the cache multiple times. Shouldn't we add only that one specific workflow's webhooks in addWebhooks?
Project | n8n
---|---
Branch Review | cat-637
Run duration | 04m 37s
Committer | Tomi Turtiainen
Test results | 0, 0, 5, 0, 436

✅ All Cypress E2E specs passed
Got released with |
Summary
Since the first commit, workflow activation has worked as follows: we retrieve all workflows persisted as active and loop over them to activate their webhooks, triggers, and pollers, registering each workflow as active in memory if it has triggers or pollers. Later we also added populating the webhook cache for workflows that have webhooks.
Since workflow activation happens after we stand up the server, we have a race condition between user edits to a workflow's active state and the activation loop. As a result, a user-deactivated workflow can end up active when later processed by the loop. The window for this to happen is longer the more workflows the instance has to activate.
To reduce (but not eliminate) this risk, this PR skips activation of workflows that were deactivated between the start of the loop and the exact time when they are about to be activated. To eliminate this risk, we'd have to move away from the bulk activation loop entirely, which will be a larger effort.
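The approach described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `InMemoryWorkflowStore`, `activateAll`, and `demo` are hypothetical names, and the in-memory store stands in for the database. It shows the loop re-reading each workflow's active state immediately before activating it and skipping any workflow that was deactivated after the loop started.

```typescript
// Hypothetical in-memory stand-in for the persisted workflow state.
// This is not n8n's repository API; it only models "active" flags.
class InMemoryWorkflowStore {
  private readonly activeById: Record<string, boolean> = {};

  constructor(activeIds: string[]) {
    for (const id of activeIds) this.activeById[id] = true;
  }

  // Snapshot of the ids persisted as active when the loop starts.
  async getActiveIds(): Promise<string[]> {
    return Object.keys(this.activeById).filter((id) => this.activeById[id]);
  }

  // Fresh read of one workflow's current active state.
  async isActive(id: string): Promise<boolean> {
    return this.activeById[id] === true;
  }

  // Models a user deactivating a workflow while the loop is running.
  async deactivate(id: string): Promise<void> {
    this.activeById[id] = false;
  }
}

// Hypothetical bulk activation loop with a re-check before each activation.
async function activateAll(
  store: InMemoryWorkflowStore,
  activate: (id: string) => Promise<void>,
): Promise<{ activated: string[]; skipped: string[] }> {
  const activated: string[] = [];
  const skipped: string[] = [];

  const snapshot = await store.getActiveIds();
  for (const id of snapshot) {
    // Re-read the state: the snapshot may be stale by the time we get here.
    if (!(await store.isActive(id))) {
      skipped.push(id);
      continue;
    }
    // A deactivation that lands after the check above but before
    // activation completes is still missed, so the risk is reduced,
    // not eliminated.
    await activate(id);
    activated.push(id);
  }

  return { activated, skipped };
}

// Simulates a user deactivating 'wf-2' while 'wf-1' is being activated.
async function demo(): Promise<{ activated: string[]; skipped: string[] }> {
  const store = new InMemoryWorkflowStore(['wf-1', 'wf-2', 'wf-3']);
  return activateAll(store, async (id) => {
    if (id === 'wf-1') await store.deactivate('wf-2');
  });
}
```

In this sketch, `demo()` resolves with `wf-1` and `wf-3` activated and `wf-2` skipped. Without the re-check, `wf-2` would have been activated from the stale snapshot even though the user had already turned it off.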
Related Linear tickets, GitHub issues, and Community forum posts
https://linear.app/n8n/issue/CAT-637
Review / Merge checklist
release/backport
(if the PR is an urgent fix that needs to be backported)