Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

set node.StatusUpdatedAt in raft #5746

Merged
merged 2 commits into from
Jun 5, 2019
Merged

set node.StatusUpdatedAt in raft #5746

merged 2 commits into from
Jun 5, 2019

Conversation

notnoop
Copy link
Contributor

@notnoop notnoop commented May 21, 2019

Fix a case where node.StatusUpdatedAt was manipulated directly in
memory. This ensures that StatusUpdatedAt is set in raft layer.

Also, this ensures that the field is updated when node drain/eligibility is updated too. Previously, StatusUpdatedAt is only updated at next node heartbeat. Interestingly, the drain case was handled previously in 76b913c but it got dropped at some point, but not sure why.

Fix a case where `node.StatusUpdatedAt` was manipulated directly in
memory.

This ensures that StatusUpdatedAt is set in raft layer, and ensures that
the field is updated when node drain/eligibility is updated too.
Copy link
Member

@schmichael schmichael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Node.StatusUpdatedAt purely informational or does logic depend on it?

I ask because it seems like during server upgrades old servers will be setting UpdatedAt=0 which will overwrite Node.StatusUpdatedAt times that are >0. The pathological case being if the time is used for determining client liveness and suddenly during an upgrade 2/3 of the clients get marked as lost.

We could guard the writes with a >0 check, but if it's purely informational I don't think it's worth worrying about as the the times will be corrected once all servers are updated and all clients have heartbeated.

@notnoop
Copy link
Contributor Author

notnoop commented May 22, 2019

Is Node.StatusUpdatedAt purely informational or does logic depend on it?

It's purely informational, as it's returned in API. The only place that nomad code reads it is when StateStore.UpsertNode emits a (re-)registration event and marks the timestamp [1].

[1]

appendNodeEvents(index, node, []*structs.NodeEvent{
structs.NewNodeEvent().SetSubsystem(structs.NodeEventSubsystemCluster).
SetMessage(NodeRegisterEventReregistered).
SetTimestamp(time.Unix(node.StatusUpdatedAt, 0))})
}
node.Drain = exist.Drain // Retain the drain mode
node.SchedulingEligibility = exist.SchedulingEligibility // Retain the eligibility
node.DrainStrategy = exist.DrainStrategy // Retain the drain strategy
} else {
// Because this is the first time the node is being registered, we should
// also create a node registration event
nodeEvent := structs.NewNodeEvent().SetSubsystem(structs.NodeEventSubsystemCluster).
SetMessage(NodeRegisterEventRegistered).
SetTimestamp(time.Unix(node.StatusUpdatedAt, 0))

@notnoop notnoop merged commit 3f21aba into master Jun 5, 2019
@notnoop notnoop deleted the b-no-updating-inmem-node branch June 5, 2019 23:05
@github-actions
Copy link

github-actions bot commented Feb 9, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 9, 2023
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants