This repository has been archived by the owner on Nov 9, 2020. It is now read-only.
Currently, vmdk-opsd stop simply stops (exits) the vmdk_ops.py service. If operations are in flight, they can be killed in the middle of execution, which can leave VM attachments, KV files, or the auth-db in an inconsistent state.
To prevent this, we need to drain the operations in flight before stopping.
Something along these lines:
[ ] On 'stop' requests, add a barrier so that executeRequest fails all new incoming requests with "The server is stopping, operation declined".
[ ] Add a count of ops in flight and, once the barrier is installed, make sure the count drops to zero before proceeding.
[ ] Add a way to force-kill the service in case something is stuck (e.g. a 10TB format is running). This will behave like today's "stop" but issue an additional message about the potential danger. I suggest vmdk-opsd kill.
[ ] Add a message for "status" saying that the "service is being stopped, waiting for completion of X operations in flight".
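
To make the proposal concrete, here is a rough sketch of the barrier plus in-flight counter. It is only an illustration: the names DrainBarrier and ServiceStopping are hypothetical and do not exist in vmdk_ops.py, and it assumes requests are handled on threads in a single process running Python 3.

```python
import threading


class ServiceStopping(Exception):
    """Hypothetical error raised for requests that arrive after 'stop'."""


class DrainBarrier:
    """Hypothetical helper: counts ops in flight and blocks new ones on stop."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._stopping = False

    def __enter__(self):
        # Called at the start of each request (e.g. around executeRequest).
        with self._cond:
            if self._stopping:
                raise ServiceStopping(
                    "The server is stopping, operation declined")
            self._in_flight += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        # Called when the request finishes, whether it succeeded or failed.
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()
        return False  # never swallow the request's own exception

    def stop(self, timeout=None):
        # Install the barrier, then wait for the count to drop to zero.
        # Returns True once drained, False if the timeout expired first.
        with self._cond:
            self._stopping = True
            return self._cond.wait_for(
                lambda: self._in_flight == 0, timeout)

    def status(self):
        with self._cond:
            if self._stopping:
                return ("service is being stopped, waiting for completion "
                        "of %d operations in flight" % self._in_flight)
            return ("service is running, %d operations in flight"
                    % self._in_flight)
```

With something like this, the request handler would wrap its work in "with barrier:", the stop path would call barrier.stop() before exiting, and a kill path could skip the wait (or exit after stop() times out) while printing the warning about potential inconsistencies.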
//CC @pdhamdhere