TestConcurrency is failing: volume creation is failing against docker 1.13 on ubuntu 16.04 VMs #881
Comments
Hi Ritesh, can you please take a look at this issue? What's the purpose of this test?
I tried the test again and it works fine locally. In CI I am seeing an error when removing the test volume, which is different from what's reported in the issue. Debugging this further.
I've reproduced this issue several times with a single Docker host, and the way the test is designed there is no guarantee that the "volume not found" error will not happen. Basically, up to a default of 5 or 2 threads are created to run volume create and delete in parallel, and unless those are on separate datastores there is every chance that one thread will remove a volume and another will get an error. As long as there are no "KV file not found" or hang-on-lock type issues, the remove error reported in the test is expected. I don't see any change needed for this issue, except maybe to have the test report errors only for genuine cases, such as a create that should have succeeded. There can be genuine volume create/remove errors, which should be caught in other tests as well. For this test, I suggest logging errors rather than failing the test.
Hello @govint
How could this happen if each goroutine is working with a different set of volume names?
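For reference, the policy suggested above (fail on create errors, only log remove errors) combined with per-goroutine volume names could look roughly like the sketch below. This is not the actual TestConcurrency code: `fakeStore`, `runWorkers`, and the `vol-w<id>-<n>` naming are hypothetical, and an in-memory map stands in for the real plugin and datastore.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"sync"
)

// errNotFound mimics the "volume not found" error discussed in this thread.
var errNotFound = errors.New("volume not found")

// fakeStore is a hypothetical in-memory stand-in for the volume backend.
type fakeStore struct {
	mu   sync.Mutex
	vols map[string]bool
}

func newFakeStore() *fakeStore { return &fakeStore{vols: make(map[string]bool)} }

// Create adds a volume and fails if the name is already taken.
func (s *fakeStore) Create(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.vols[name] {
		return fmt.Errorf("volume %s already exists", name)
	}
	s.vols[name] = true
	return nil
}

// Remove deletes a volume and returns errNotFound if it does not exist.
func (s *fakeStore) Remove(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.vols[name] {
		return errNotFound
	}
	delete(s.vols, name)
	return nil
}

// Len reports how many volumes are still present.
func (s *fakeStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.vols)
}

// runWorkers starts `workers` goroutines. Each one creates and then removes
// `perWorker` volumes whose names are unique to that goroutine. Create errors
// are collected so the caller can fail on them; remove errors are only
// logged and counted.
func runWorkers(s *fakeStore, workers, perWorker int) (createErrs []error, removeErrs int) {
	var (
		wg sync.WaitGroup
		mu sync.Mutex // guards createErrs and removeErrs
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				name := fmt.Sprintf("vol-w%d-%d", id, i)
				if err := s.Create(name); err != nil {
					mu.Lock()
					createErrs = append(createErrs, err)
					mu.Unlock()
					continue
				}
				if err := s.Remove(name); err != nil {
					log.Printf("remove %s: %v (logged, not fatal)", name, err)
					mu.Lock()
					removeErrs++
					mu.Unlock()
				}
			}
		}(w)
	}
	wg.Wait()
	return createErrs, removeErrs
}

func main() {
	s := newFakeStore()
	createErrs, removeErrs := runWorkers(s, 5, 10)
	if len(createErrs) > 0 {
		log.Fatalf("create failed: %v", createErrs)
	}
	fmt.Printf("remove errors logged: %d, volumes left: %d\n", removeErrs, s.Len())
}
```

With disjoint name sets, as in this sketch, no goroutine can remove another's volume, so any "volume not found" on remove would point at something other than the test design.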
Ok, let me say, I've been testing different changes and lied a bit above. No, this test works fine on a local setup (like I updated earlier), but I do get the errors in CI. And at least one looks like a repro of #954, so I'll debug that and the remove vol errors I'd seen earlier.
Able to consistently repro #954 with the concurrent test with two clients. Will be debugging that exclusively.
Ok, I was able to test this on a pair of Ubuntu VMs (14.04) and the test runs perfectly well. I modified the test to do just the parallel volume create and remove between two VMs, for a total of fifty volumes, just to see if that many reproduces the problem. I'm unable to repro the issue at all. But in CI it's consistently reproducible, and the issue is exactly what's reported in #954 (which seems to be a duplicate of this issue). The buf pointer returned from C looks valid, and for some reason the C.Free() seems to be getting a fault; not always, but it's consistent in this test.
Photon OS issue - vmware/photon#614
The concurrent test failures seem isolated to Photon OS and seem to be reproduced with a specific version of Photon OS on ESX 6.0. Photon OS 4.4.41-1 with ESX 6.0P04 and ESX 6.5 doesn't repro the issue.
Steps to reproduce:
Note: TestConcurrency is commented out with #879; please uncomment the concurrency tests after fixing the reported issue.
vmdk_ops log
some observation:
//CC @kerneltime