SubPath unmount fails with "directory not empty" after smb-server Service IP changes #222

Closed
drigz opened this issue Feb 3, 2021 · 3 comments · Fixed by #268
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@drigz
drigz commented Feb 3, 2021

What happened:

After the smb-server Service IP changes, I am unable to delete pods that have mounted smb volumes. The pods get stuck in "Terminating", and the kubelet logs contain:

Feb 03 18:40:58  kubelet[862425]: W0203 18:40:58.568146  862425 mount_helper_common.go:33] Warning: Unmount skipped because path does not exist: /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume/1
Feb 03 18:41:19  kubelet[862425]: W0203 18:41:19.048156  862425 mount_helper_common.go:33] Warning: Unmount skipped because path does not exist: /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume/2
Feb 03 18:41:19  kubelet[862425]: E0203 18:41:19.048366  862425 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/smb.csi.k8s.io^volume podName:ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6 nodeName:}" failed. No retries permitted until 2021-02-03 18:43:21.048305415 +0100 CET m=+12623.734651277 (durationBeforeRetry 2m2s). Error: "error cleaning subPath mounts for volume \"volume\" (UniqueName: \"kubernetes.io/csi/smb.csi.k8s.io^volume\") pod \"ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6\" (UID: \"ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6\") : error deleting /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume: remove /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume: directory not empty"

The "path does not exist" warning is wrong:

$ sudo ls /var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume/1
ls: cannot access '/var/lib/kubelet/pods/ddc7ea95-c27f-4b3c-9136-cb6acea5f0f6/volume-subpaths/volume/volume/1': Host is down

What you expected to happen:

Pods can be deleted and recreated.

How to reproduce it:

  1. Create a volume based on smb-server.yaml.
  2. Mount the volume in a pod.
  3. Delete and recreate the smb-server service to change its IP. (I think this can happen in other ways; I only did this to reproduce an error we encountered on another cluster.)
  4. Restart the smb-server process. (not sure why this is required, maybe to break existing connections?)
  5. Try to delete the pod.

Anything else we need to know?:

#64 seems similar but the error is different.
kubernetes/kubernetes#97031 seems to be a similar codepath (and the fix may be the same?) but with a different cause.

Environment:

  • CSI Driver version: 0.4.0
  • Kubernetes version (use kubectl version): 1.18.10
  • OS (e.g. from /etc/os-release): Debian testing
  • Kernel (e.g. uname -a): 5.7.17
  • Install tools: manually copied YAMLs from this repo
  • Others:
@drigz
Author

drigz commented Feb 3, 2021

I also have a question: if changing the service IP addresses breaks existing mounts, would it be better to specify a clusterIP in the smb-server Service?

@drigz
Author

drigz commented Feb 3, 2021

Note: if you try to create a new pod instead of deleting an old one, you get the error described in #164: MountVolume.MountDevice failed for volume "workcell-spec" : stat /var/lib/kubelet/plugins/kubernetes.io/csi/pv/workcell-spec/globalmount: host is down. I have also seen that error in our live clusters, although I don't know whether it correlated with an IP change in those cases.

@andyzhangx
Member

looks like it's related to #164, need upstream fix, will take a look later.

@andyzhangx andyzhangx added the kind/bug Categorizes issue or PR as related to a bug. label Feb 19, 2021
andyzhangx added a commit to andyzhangx/csi-driver-smb that referenced this issue Aug 11, 2023