kube-proxy get stuck if master is recreated on new instance #56720

Closed
calvix opened this issue Dec 1, 2017 · 9 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@calvix

calvix commented Dec 1, 2017

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:
Kube-proxy gets stuck after the master goes down and is recreated on a new machine.

We run kube-proxy as a DaemonSet, and under normal circumstances it works fine.

We run k8s nodes as immutable instances: if there is a reboot, stop, or error, the node is recreated as a whole new machine with a new IP, MAC address, and everything else. Etcd data is stored on persistent storage, but the OS is not.
The K8s API endpoint stays the same.

When the master is "recreated", kube-proxy ends up in a stuck state in which it does not work. We run health checks on kube-proxy, but they do not trigger a restart, because kube-proxy thinks it is healthy and there is not a single log entry indicating that anything is wrong.
To fix it, we have to kill all kube-proxy pods; after that it works again.

My wild guess is that kube-proxy holds an open connection to the k8s API, and when the master is recreated with a new IP, kube-proxy keeps using the old, dead connection.

What you expected to happen:
Kube-proxy checks periodically whether its current connection to the K8s API is still valid and, if it is not, forces a reconnection.

How to reproduce it (as minimally and precisely as possible):

  • Create a running k8s cluster with a single master.
  • Recreate the master with the same etcd data and endpoint but a different instance IP.
  • Test existing resources of type Service (they should not work properly), or create a new Service and test that.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5+coreos.0", GitCommit:"070d238cd2ec359928548e486a9171b498573181", GitTreeState:"clean", BuildDate:"2017-08-31T21:28:39Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

We also saw similar behavior on 1.8.1.

  • Cloud provider or hardware configuration: baremetal
  • OS (e.g. from /etc/os-release):
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1465.8.0
VERSION_ID=1465.8.0
BUILD_ID=2017-09-20-2237
PRETTY_NAME="Container Linux by CoreOS 1465.8.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
  • Kernel (e.g. uname -a):
Linux 00008df14a32b2b9 4.12.14-coreos #1 SMP Wed Sep 20 22:20:05 UTC 2017 x86_64 Intel(R) Xeon(R) CPU E5-2637 v2 @ 3.50GHz GenuineIntel GNU/Linux
  • Install tools: selfhosted
  • Others:

@kubernetes/sig-network-bugs

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. kind/bug Categorizes issue or PR as related to a bug. labels Dec 1, 2017
@k8s-ci-robot
Contributor

@calvix: Reiterating the mentions to trigger a notification:
@kubernetes/sig-network-bugs


@MrHohn
Member

MrHohn commented Dec 11, 2017

We run health checks on the kube-proxy, but this does not trigger any restart as the kube-proxy thinks that its healthy and there is not a single log entry indicating that anything is wrong.

Curious how the kube-proxy health check is configured. Agreed, it is odd that no errors are logged.

@calvix
Author

calvix commented Dec 12, 2017

Hello @MrHohn ,

We have this health check:

livenessProbe:
  httpGet:
    path: /healthz
    port: 10256
  initialDelaySeconds: 10
  periodSeconds: 3

@thockin
Member

thockin commented Jan 6, 2018

kube-proxy usually talks to the master directly by IP, so if you change the master IP, kube-proxy gets lost?

@r7vme

r7vme commented Jan 8, 2018

kube-proxy usually talks to the master directly by IP

Interesting. We use the --kubeconfig= flag, where we supply the "external" endpoint of the Kubernetes API, so kube-proxy connects to something like https://api.foo.bar, which is a load balancer VIP.

In general, I think this issue is related to the fact that we hard-kill the master VM, which leaves some TCP connections stuck in the ESTABLISHED state. We had similar problems with Calico (projectcalico/calico#1420) and with our own node-controller (which uses the plain client-go library).

If the VM that serves the k8s API gets killed, the TCP connection stays open on the client side until the client drops it (for me that took 10-17 minutes). Even TCP keepalive (which is enabled by default in client-go/golang-net) does not help. In the end, the only solution for node-controller (an app that uses client-go) was to have Kubernetes restart the pod when it cannot connect to the API. Here is my question in client-go. I hope someone will answer it at some point, as I think this is the most likely cause of similar issues.

A similar k8s issue that may also be related (I've attached a lot of tcpdumps there).

P.S. @calvix and I are from the same company.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 8, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 8, 2018
@calvix
Author

calvix commented May 9, 2018

The issue is not fixed, but we have workarounds in place that help us mitigate it.

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

6 participants