Recovery steps documentation #177

Open
aravindavk opened this issue Apr 10, 2022 · 0 comments


  • When a Manager node goes down - No management operations are possible. Mounted Volumes continue to work but no new mounts are possible.

    • Temporary - Wait until the Manager node comes back online.
    • Permanent failure (the Mgr URL must then be updated in all Storage nodes)
      • Set up a new node with the same or a different IP/hostname, then restore the Config data from the backup, OR
      • Promote any one existing Storage node and restore the Config data from the backup onto it.
  • When a Storage node goes down

    • Temporary - No action is required; once the node comes back online, things return to normal.
    • Permanent failure
      • Set up a new node with the same IP/hostname and run the node re-add command to add the node back to the Pool, OR
      • Set up a new node with a different IP/hostname, then run the node re-add command with the flag --new-name=NEW_HOSTNAME.
  • Create a new Token for Mgr-to-Node and Node-to-Mgr communication (key rotation)

      kadalu node new-token PROD/server1.example.com
    

Identify the code changes required and update the documentation once they are implemented.
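
To make the scenarios above easier to review, here is a minimal Python sketch of the same decision table. It only returns the steps as text and does not call kadalu. The names `Role`, `Failure`, and `recovery_steps` are hypothetical and exist only for this illustration, and the step wording paraphrases the list above rather than describing implemented behaviour.

```python
from __future__ import annotations

from enum import Enum


class Role(Enum):
    MANAGER = "manager"
    STORAGE = "storage"


class Failure(Enum):
    TEMPORARY = "temporary"
    PERMANENT = "permanent"


def recovery_steps(role: Role, failure: Failure, same_hostname: bool = True) -> list[str]:
    """Return the proposed recovery steps for one failure scenario.

    Hypothetical helper: it mirrors the bullet list in this issue and
    does not run any kadalu command.
    """
    if failure is Failure.TEMPORARY:
        # Both node types only need to come back online.
        return [f"Wait until the {role.value} node comes back online."]

    if role is Role.MANAGER:
        # Permanent Manager failure: replace or promote, restore, re-point.
        return [
            "Set up a new node (same or different IP/hostname), "
            "or promote an existing Storage node.",
            "Restore the Config data from the backup.",
            "Update the Mgr URL in all Storage nodes.",
        ]

    # Permanent Storage node failure: the re-add command depends on the hostname.
    steps = ["Set up a new node."]
    if same_hostname:
        steps.append("Run the node re-add command.")
    else:
        steps.append("Run the node re-add command with --new-name=NEW_HOSTNAME.")
    return steps


if __name__ == "__main__":
    for step in recovery_steps(Role.STORAGE, Failure.PERMANENT, same_hostname=False):
        print("-", step)
```

Keeping the table as data-returning code like this could also serve as a checklist when writing the final documentation, since each branch corresponds to one bullet that needs a documented procedure.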
