Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[Reset] Handle completion events from restarted children #7295

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

gow
Copy link
Contributor

@gow gow commented Feb 7, 2025

What changed?

In this PR we are doing 2 things,

  1. Cleaning up the entries in ChildrenInitializedPostResetPoint
  2. Verifying that the results are indeed from the correct restarted child before accepting the completion event.

Why?

  • Cleans up the mutable state and reduces bloat.
  • Avoids a race condition where a old child completed in between the time when new child Init is recorded and when it is actually started.

How did you test it?

Added unit tests + manual testing.

Potential risks

N/A

Documentation

Pending

Is hotfix candidate?

No

@gow gow requested a review from a team as a code owner February 7, 2025 20:00
@gow gow marked this pull request as draft February 7, 2025 20:00
@gow gow requested a review from yycptt February 7, 2025 20:00
@gow gow marked this pull request as ready for review February 7, 2025 22:18
return nil, consts.ErrChildExecutionNotFound
}
initiatedAttr := initiatedEvent.GetStartChildWorkflowExecutionInitiatedEventAttributes()
childID := fmt.Sprintf("%s:%s", initiatedAttr.GetWorkflowType().Name, initiatedAttr.GetWorkflowId())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we really should create a util function for this concatenation logic. I think I've seen it in 3 different places.

childrenInitializedAfterResetPoint := mutableState.GetChildrenInitializedPostResetPoint()
if len(childrenInitializedAfterResetPoint) > 0 {
// This parent was reset and it also has some children that potentially were restarted.
initiatedEvent, err := mutableState.GetChildExecutionInitiatedEvent(ctx, parentInitiatedID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I don't think we need to load the initiatedEvent. Both child workflow type and child workflowID are available in ChildExecutionInfo in mutable state I think.

}
// The results are from the child that this parent initiated. We should stop tracking the child and process the result now.
delete(childrenInitializedAfterResetPoint, childID)
mutableState.SetChildrenInitializedPostResetPoint(childrenInitializedAfterResetPoint)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz update ms.approximateSize when setting the reset point.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants