
failed to unmount due to "host is down" #64

Closed

andyzhangx opened this issue Jul 11, 2020 · 9 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@andyzhangx
Member

What happened:
This issue happened on k8s v1.15.11; need to check whether the latest k8s version has fixed it:

Jul 11 07:53:08 aks-agentpool-60632172-vmss000007 kubelet[4580]: I0711 07:53:08.728754    4580 controlbuf.go:382] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
Jul 11 07:53:18 aks-agentpool-60632172-vmss000007 kubelet[4580]: E0711 07:53:18.968266    4580 csi_mounter.go:428] kubernetes.io/csi: isDirMounted IsLikelyNotMountPoint test failed for dir [/var/lib/kubelet/pods/aebb8a69-b5fe-4b36-b8cc-59800e5e6fa6/volumes/kubernetes.io~csi/pvc-255b2e00-87d3-4d33-b02d-6cdc2d6394b1/mount]
Jul 11 07:53:18 aks-agentpool-60632172-vmss000007 kubelet[4580]: E0711 07:53:18.968312    4580 csi_mounter.go:378] kubernetes.io/csi: mounter.TearDownAt failed to clean mount dir [/var/lib/kubelet/pods/aebb8a69-b5fe-4b36-b8cc-59800e5e6fa6/volumes/kubernetes.io~csi/pvc-255b2e00-87d3-4d33-b02d-6cdc2d6394b1/mount]: stat /var/lib/kubelet/pods/aebb8a69-b5fe-4b36-b8cc-59800e5e6fa6/volumes/kubernetes.io~csi/pvc-255b2e00-87d3-4d33-b02d-6cdc2d6394b1/mount: host is down
Jul 11 07:53:18 aks-agentpool-60632172-vmss000007 kubelet[4580]: E0711 07:53:18.968397    4580 nestedpendingoperations.go:270] Operation for "\"kubernetes.io/csi/smb.csi.k8s.io^pvc-255b2e00-87d3-4d33-b02d-6cdc2d6394b1\" (\"aebb8a69-b5fe-4b36-b8cc-59800e5e6fa6\")" failed. No retries permitted until 2020-07-11 07:53:19.968350377 +0000 UTC m=+174446.119043970 (durationBeforeRetry 1s). Error: "UnmountVolume.TearDown failed for volume \"smb\" (UniqueName: \"kubernetes.io/csi/smb.csi.k8s.io^pvc-255b2e00-87d3-4d33-b02d-6cdc2d6394b1\") pod \"aebb8a69-b5fe-4b36-b8cc-59800e5e6fa6\" (UID: \"aebb8a69-b5fe-4b36-b8cc-59800e5e6fa6\") : stat /var/lib/kubelet/pods/aebb8a69-b5fe-4b36-b8cc-59800e5e6fa6/volumes/kubernetes.io~csi/pvc-255b2e00-87d3-4d33-b02d-6cdc2d6394b1/mount: host is down"
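For reference, the "host is down" text in these log lines is the EHOSTDOWN errno that stat returns once the SMB server backing the mount has become unreachable; the kubelet's mount-point check boils down to a stat of the mount directory, which is the call that fails here. A minimal Go sketch (assuming a Linux node and a hypothetical mount path, not code from the driver or kubelet) that surfaces the same errno:

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

func main() {
	// Hypothetical path; substitute the CSI mount dir from the kubelet log above.
	dir := "/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/<pv-name>/mount"

	// On a CIFS/SMB mount whose server is unreachable, the stat fails with
	// EHOSTDOWN, which the kernel renders as "host is down".
	if _, err := os.Stat(dir); err != nil {
		var errno syscall.Errno
		if errors.As(err, &errno) && errno == syscall.EHOSTDOWN {
			fmt.Printf("mount %s is backed by an unreachable host: %v\n", dir, err)
		}
	}
}
```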

What you expected to happen:

How to reproduce it:

Anything else we need to know?:

Environment:

  • CSI Driver version:
  • Kubernetes version (use kubectl version): 1.15.11
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
@andyzhangx andyzhangx added the kind/bug Categorizes issue or PR as related to a bug. label Jul 11, 2020
@boddumanohar
Contributor

@andyzhangx Can you please provide more instructions on how to reproduce the issue? I would like to see if I can fix this.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 6, 2020
@andyzhangx andyzhangx removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 6, 2020
@andyzhangx
Member Author

andyzhangx commented Feb 23, 2021

  • Similar error messages:
-- Logs begin at Sat 2021-02-20 16:54:54 GMT, end at Mon 2021-02-22 22:04:28 GMT. --
Feb 20 16:54:56 agent-node000002 kubelet[1917]: E0220 16:54:56.193070    1917 kubelet_volumes.go:65] pod "2517cdab-e91e-4006-b5a0-e824bb25f83c" found, but error stat /var/lib/kubelet/pods/2517cdab-e91e-4006-b5a0-e824bb25f83c/volumes/kubernetes.io~azure-file/app-name-persistent-volume: host is down occurred during checking mounted volumes from disk
Feb 20 16:54:56 agent-node000002 kubelet[1917]: E0220 16:54:56.193972    1917 kubelet_volumes.go:65] pod "a4a19c0b-d254-467e-9d26-86e84f6b85ed" found, but error stat /var/lib/kubelet/pods/a4a19c0b-d254-467e-9d26-86e84f6b85ed/volumes/kubernetes.io~azure-file/app-name-test-persistent-volume: host is down occurred during checking mounted volumes from disk
Feb 20 16:54:56 agent-node000002 kubelet[1917]: E0220 16:54:56.194774    1917 kubelet_volumes.go:65] pod "d3806c88-a022-46a7-bfb8-a4e20fa992fe" found, but error stat /var/lib/kubelet/pods/d3806c88-a022-46a7-bfb8-a4e20fa992fe/volumes/kubernetes.io~azure-file/dxf-persistent-volume: host is down occurred during checking mounted volumes from disk
Feb 20 16:54:56 agent-node000002 kubelet[1917]: E0220 16:54:56.195421    1917 kubelet_volumes.go:65] pod "2517cdab-e91e-4006-b5a0-e824bb25f83c" found, but error stat /var/lib/kubelet/pods/2517cdab-e91e-4006-b5a0-e824bb25f83c/volumes/kubernetes.io~azure-file/app-name-persistent-volume: host is down occurred during checking mounted volumes from disk
Feb 20 16:54:56 agent-node000002 kubelet[1917]: E0220 16:54:56.196135    1917 kubelet_volumes.go:65] pod "a4a19c0b-d254-467e-9d26-86e84f6b85ed" found, but error stat /var/lib/kubelet/pods/a4a19c0b-d254-467e-9d26-86e84f6b85ed/volumes/kubernetes.io~azure-file/app-name-test-persistent-volume: host is down occurred during checking mounted volumes from disk
Feb 20 16:54:56 agent-node000002 kubelet[1917]: E0220 16:54:56.196903    1917 kubelet_volumes.go:65] pod "d3806c88-a022-46a7-bfb8-a4e20fa992fe" found, but error stat /var/lib/kubelet/pods/d3806c88-a022-46a7-bfb8-a4e20fa992fe/volumes/kubernetes.io~azure-file/dxf-persistent-volume: host is down occurred during checking mounted volumes from disk

@andyzhangx
Member Author

I finally spent some time trying to fix this issue. There should be at least two PRs; the first is kubernetes/utils#203.
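As I understand it, that utils PR teaches the shared corrupted-mount helper in k8s.io/utils to treat "host is down" (EHOSTDOWN) the same way as ENOTCONN/ESTALE, so the kubelet can tear the mount down instead of failing on the stat. A rough, hypothetical sketch of that classification and the lazy unmount it enables (not the actual k8s.io/utils code; the helper name and path below are illustrative):

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"

	"golang.org/x/sys/unix"
)

// isCorruptedMnt is a hypothetical stand-in for the corrupted-mount check in
// k8s.io/utils: if stat on a mount point fails with one of these errnos, the
// mount is unusable and should be unmounted rather than reported as a fatal error.
func isCorruptedMnt(err error) bool {
	var errno syscall.Errno
	if !errors.As(err, &errno) {
		return false
	}
	switch errno {
	case syscall.EHOSTDOWN, // "host is down" — the case hit in this issue
		syscall.ENOTCONN, syscall.ESTALE, syscall.EIO:
		return true
	}
	return false
}

func main() {
	// Hypothetical mount dir; substitute the path from the kubelet error.
	dir := "/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/<pv-name>/mount"

	if _, err := os.Stat(dir); err != nil && isCorruptedMnt(err) {
		// Lazy-detach the dead mount so the directory can be removed afterwards,
		// equivalent to running `umount -l <dir>` on the node.
		if err := unix.Unmount(dir, unix.MNT_DETACH); err != nil {
			fmt.Printf("unmount failed: %v\n", err)
		}
	}
}
```

Until a fix lands, the usual manual workaround on an affected node is the same lazy unmount (`umount -l` on the stuck directory) followed by recreating the pod.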

@andyzhangx
Member Author

This would be fixed by kubernetes/kubernetes#101305.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 20, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 19, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
