Check attached status in removeVMDK before trying to remove volume #720
Conversation
attached, uuid, attach_as = getStatusAttached(vmdk_path)
if attached:
    vm = findVmByUuid(uuid)
    logging.info("*** removeVMDK: %s is in use by %s VM uuid = %s",
On another host, the uuid will not translate to a VM object and can't be used. I was thinking of something more like the change below:
kv_status_attached, kv_uuid, attach_mode = getStatusAttached(vmdk_path)
if kv_status_attached and kv_uuid != None:
    return err("Failed to remove %s - disk is in use by VM %s." % (vmdk_path, kv_uuid))
It's possible that the KV is marked as attached but the VM UUID itself is not set. That should not happen, but this guards against that scenario just in case it does occur.
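For reference, a minimal sketch of how that guard could sit inside removeVMDK(); getStatusAttached() and err() are the existing helpers shown in the diff, while the surrounding structure and the trailing placeholder are assumptions, not the actual code:

```python
import logging

def removeVMDK(vmdk_path):
    """Sketch: refuse to remove a volume whose KV metadata marks it as attached."""
    logging.info("*** removeVMDK: %s", vmdk_path)

    # Ask the KV sidecar for the current attach status.
    kv_status_attached, kv_uuid, attach_mode = getStatusAttached(vmdk_path)

    # Refuse the delete while the KV says the disk is attached. The UUID may be
    # unset even when the status says "attached", so guard for that case too.
    if kv_status_attached and kv_uuid is not None:
        return err("Failed to remove %s - disk is in use by VM %s."
                   % (vmdk_path, kv_uuid))

    # Not attached as far as the KV knows: fall through to the actual removal
    # (the vmkfstools / disk-lib call), which this PR leaves unchanged.
    ...
```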
@govint
You're right :) I can close this PR if you already have it done.
No, please go ahead and let's put this through @kerneltime's testing and close the issue.
Force-pushed from f29b876 to a8bbccc
In the absence of our own locking framework around the VMDK + key/value sidecar, and given the existing bug, this seems like a reasonable fix that will reduce the probability of hitting it.
I ran a quick test for this code change.
The volume continues to be in a good state on reattach, and remove fails if the volume is attached. The log message on the VM trying to remove:
@kerneltime, sorry for not posting the tests earlier. From host1:
From host2:
From esx:
From host1:
From esx:
# Check the current volume status
attached, uuid, attach_as = getStatusAttached(vmdk_path)
if attached:
    logging.info("*** removeVMDK: %s is in use, VM uuid = %s", vmdk_path, uuid)
Nit: it would be nice to include the vm_name as well.
@kerneltime
It was there in the first incarnation, but @govint alerted me:
"On another host, the uuid will not translate to a VM object and can't be used."
Ok, makes sense.
# Check the current volume status
attached, uuid, attach_as = getStatusAttached(vmdk_path)
if attached:
Information in the KV file could be stale, which would prevent the user from ever deleting the volume. I believe this is a workaround for the real bug, where Unlink/Delete deletes the sidecar (VMFD) but leaves the VMDK behind if the VMDK is already in use. Does this bug exist with the API too?
Also, would it be better to query the underlying system rather than relying on the KV? For example, "vmkfstools -D" tells you whether a file is locked and by whom. If it is locked, fail the delete?
@pdhamdhere
Yes, the real problem is vmkfstools unlinking the associated vmdk files without any verification. I haven't had time yet to test with vim.virtualdiskmanager and wasn't aware of the -D flag. I will try that later today.
Maybe we also need a background async mechanism to detect and fix cases of stale metadata, so we are not compelled to distrust it?
@pdhamdhere
Using the API has the same destructive effect as vmkfstools -U.
Using vmkfstools -D to find out if the volume is in use works, but not on the volume.vmdk descriptor; in my case it was the volume-flat.vmdk that held the lock.
According to this document, we should check whether the owner field is set.
Have we identified the conditions or code paths that lead to the stale metadata cases? If not an async mechanism (relative to the ops), we could run consistency/integrity checks at the end of each op that changes metadata (while we still hold the data for saving or updating).
vmkfstools -D /vmfs/volumes/datastore2/dockvols/vol1174812.vmdk
Lock [type 10c00001 offset 99715072 v 3122, hb offset 3170304
gen 169, mode 0, owner 00000000-00000000-0000-000000000000 mtime 139245
num 0 gblnum 0 gblgen 0 gblbrk 0]
Addr <4, 222, 33>, gen 3104, links 1, type reg, flags 0, uid 0, gid 0, mode 600
len 566, nb 0 tbz 0, cow 0, newSinceEpoch 0, zla 4305, bs 8192
vmkfstools -D /vmfs/volumes/datastore2/dockvols/vol1174812-671ef9f932576c16.vmfd
Lock [type 10c00001 offset 99717120 v 3132, hb offset 3170304
gen 169, mode 0, owner 00000000-00000000-0000-000000000000 mtime 139221
num 0 gblnum 0 gblgen 0 gblbrk 0]
Addr <4, 222, 34>, gen 3108, links 1, type reg, flags 0, uid 0, gid 0, mode 600
len 4096, nb 1 tbz 0, cow 0, newSinceEpoch 1, zla 2, bs 8192
_Field owner is set_
vmkfstools -D /vmfs/volumes/datastore2/dockvols/vol1174812-flat.vmdk
Lock [type 10c00001 offset 99713024 v 3114, hb offset 3170304
gen 169, mode 1, owner 581272ea-1208c2da-516b-b8aeedeb6b16 mtime 139203
num 0 gblnum 0 gblgen 0 gblbrk 0]
Addr <4, 222, 32>, gen 3103, links 1, type reg, flags 0, uid 0, gid 0, mode 600
len 104857600, nb 15 tbz 0, cow 0, newSinceEpoch 15, zla 1, bs 1048576
Forgot to say that when the volume is not in use, the lock fields are populated the same as above.
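As an illustration only, a rough sketch of checking the owner field from vmkfstools -D output; the output format is taken from the dump above, but the helper name and the parsing are assumptions rather than anything in this PR:

```python
import re
import subprocess

ZERO_OWNER = "00000000-00000000-0000-000000000000"

def lock_owner(path):
    """Return the owner host UUID reported by 'vmkfstools -D <path>', or None."""
    out = subprocess.check_output(["vmkfstools", "-D", path]).decode()
    match = re.search(r"owner\s+([0-9a-f-]+)", out)
    if not match or match.group(1) == ZERO_OWNER:
        return None
    return match.group(1)

# Example: fail the delete if the -flat file is still owned by a host.
# owner = lock_owner("/vmfs/volumes/datastore2/dockvols/vol1174812-flat.vmdk")
# if owner:
#     print("volume is in use, lock held by host %s" % owner)
```

As noted in the next comment, this does not generalize to VSAN or VVOL, where there is no -flat object to inspect.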
@brunotm - we cannot rely on a lock on the -flat file, because there is no -flat on VSAN or VVOL; it would be possible, but clumsy, to find and check the locks there.
We can of course try lsof, but screen scraping is not healthy :-).
As @pdhamdhere mentioned, relying on the KV to deny 'deleteVMDK' will break some cases, e.g. the following:
- volume attached, then the VM powered off
- KV says "attached", and the volume becomes undeletable from Docker
A few choices for discussion:
- Use the existing flow, but retain the KV content (in-memory backup :-)) before the delete; if the delete fails, recover from the in-memory backup (see the sketch after this list)
- Use the suggested PR, and add "volume delete" to vmdkops_admin (and the upcoming API) for troubleshooting/recovery
- something else?
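To make the first option concrete, a minimal sketch under the assumption of simple get/restore helpers for the KV sidecar; every name below (kv_get_all, delete_vmdk, kv_restore) is hypothetical and only illustrates the shape of the flow:

```python
def remove_with_kv_backup(vmdk_path):
    """Sketch: snapshot the KV content in memory so it can be put back if the
    delete fails partway (e.g. sidecar removed but the VMDK left behind)."""
    kv_backup = kv_get_all(vmdk_path)       # hypothetical: read the full KV content

    if not delete_vmdk(vmdk_path):          # hypothetical: the existing delete path
        # The delete failed; the sidecar may already be gone, so write the
        # saved metadata back to avoid a VMDK with no KV attached to it.
        kv_restore(vmdk_path, kv_backup)    # hypothetical: recreate and refill the KV
        return err("Failed to remove %s" % vmdk_path)
    return None
```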
(1) "Use the existing flow, but retain the KV content (in-memory backup :-)) before the delete; if the delete fails, recover from the in-memory backup"
is a brilliant workaround!
If a volume is attached and the VM is powered off, then let's do this:
a. check if the volume is attached, and
b. check if the VM is powered off; if yes, then detach the disk from the VM.
All of this already happens on the attach path - vmdk_ops.py:handle_stale_attach() :-). Call the same for the removeVMDK path as well.
@brunotm, please call handle_stale_attach() if the volume is marked as attached. That is a better check than just seeing whether the volume is attached. If it is attached and the attaching VM is powered off, then go ahead: clean up the volume config and then remove the VMDK.
@govint - good point about reusing handle_stale_attach(); it seems a simple way to improve the check. However, I am very concerned that this is getting too complex, with too many if-then-else branches and race potentials. E.g. what would happen if the disk is attached to a VM registered on a different ESX host? Or if the VM with the disk is powered off, but for some reason the KV does not show it as attached? (I know it "should not" happen, but experience says that bugs exist and it could happen.)
As mentioned before, this PR is not a fix but a way to reduce the probability of getting into a bad state. The code change here is straightforward, with the caveat that the KV store is not authoritative for locking state. To make the source authoritative, I would recommend vmfsfilelockinfo.
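For completeness, a hedged sketch of what a vmfsfilelockinfo-based check might look like; the exact flags and output wording should be verified against the ESXi build in use, so treat this purely as an assumption-laden illustration:

```python
import subprocess

def locked_per_vmfsfilelockinfo(path):
    """Sketch: ask vmfsfilelockinfo whether a VMFS file is currently locked.
    Assumes the tool takes the file path via -p and mentions "is locked" in
    its output when a lock is held; verify both on the target ESXi version."""
    try:
        out = subprocess.check_output(["vmfsfilelockinfo", "-p", path],
                                      stderr=subprocess.STDOUT).decode()
    except subprocess.CalledProcessError as exc:
        out = exc.output.decode()
    return "is locked" in out and "not locked" not in out
```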
The fix is good and adds a missing check on the removeVMDK() path. With tenant-based auth, multiple VMs in a tenant, and no guarantee that all of those VMs run on the same host, the absence of this check on the remove path is really the bug.
Call handle_stale_attach(), and if there is no error then proceed to remove the volume.
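A rough sketch of that ordering; the signature and return convention of handle_stale_attach() are assumed here (an error dict on failure, None on success), not taken from the actual code:

```python
def removeVMDK(vmdk_path):
    """Sketch: clear a stale attach, if any, before deleting the volume."""
    kv_status_attached, kv_uuid, attach_mode = getStatusAttached(vmdk_path)

    if kv_status_attached:
        # Assumed convention: returns None on success, an err() dict otherwise.
        ret = handle_stale_attach(vmdk_path, kv_uuid)
        if ret:
            return ret   # still attached to a live VM, or the cleanup failed

    # At this point the volume is detached (or was never attached) and the
    # actual delete of the VMDK and its KV sidecar can proceed.
    ...
```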
Approved once the changes are rebased onto the current code.
…ched, and only proceed if successful
Force-pushed from 9b9219c to b4157d6
@govint
Fixes #719, #716