What happened:
The Virtual Kubelet restarts because of some exception (a separate issue will be opened for this), and during Virtual Kubelet startup we see some pods under some namespaces getting deleted in the remote cluster. It looks like a race condition.
Please note: not all pods under an affected namespace are deleted.
For example, below are the Virtual Kubelet logs, filtered with grep for the pod name (pod-123) under the namespace (namespace-123):
I0930 08:18:19.923187 1 reflector.go:317] Pod fallback reflection not yet completely initialized (item: "namespace-123/pod-123")
I0930 08:18:19.924729 1 reflector.go:317] ServiceAccount fallback reflection not yet completely initialized (item: "namespace-123/pod-123")
I0930 08:18:23.308783 1 pod.go:341] Pod "namespace-123/pod-123" successfully marked as Failed (OffloadingAborted)
I0930 08:18:23.700397 1 reflector.go:327] ServiceAccount reflection not yet completely initialized for local namespace "namespace-123" (item: "pod-123")
I0930 08:18:24.013970 1 reflector.go:327] ServiceAccount reflection not yet completely initialized for local namespace "namespace-123" (item: "pod-123")
I0930 08:18:25.502650 1 reflector.go:327] Pod reflection not yet completely initialized for local namespace "namespace-123" (item: "pod-123")
I0930 08:18:42.109607 1 secret.go:102] Skipping reflection of remote Secret "namespace-123/pod-123-token" as containing service account tokens
I0930 08:18:44.904709 1 podns.go:195] Deleting remote shadowpod "namespace-123/pod-123", since local pod "namespace-123/pod-123" has been previously rejected
I0930 08:18:44.913469 1 namespaced.go:97] Remote ShadowPod "namespace-123/pod-123" successfully deleted
I0930 08:18:44.949221 1 podns.go:199] Skipping reflection of local pod "namespace-123/pod-123" as previously rejected
I0930 08:18:47.825376 1 podns.go:199] Skipping reflection of local pod "namespace-123/pod-123" as previously rejected
I0930 08:18:48.713175 1 podns.go:199] Skipping reflection of local pod "namespace-123/pod-123" as previously rejected
I0930 08:18:48.727150 1 podns.go:199] Skipping reflection of local pod "namespace-123/pod-123" as previously rejected
I0930 08:18:48.738589 1 podns.go:199] Skipping reflection of local pod "namespace-123/pod-123" as previously rejected
The above flow happens only when the Virtual Kubelet pod restarts. We confirmed this by comparing the pod restart time with the pod deletion time.
What you expected to happen:
No remote pod managed by Liqo should get deleted.
How to reproduce it (as minimally and precisely as possible):
It's difficult to reproduce.
Anything else we need to know?:
Note: We have ~200 namespaces offloaded to the remote cluster and ~1500 pods reflected to the remote cluster. I'm not sure whether this scale of namespaces and pods is contributing to the problem.
Environment:
Liqo version: v0.10.1
Liqoctl version: v0.10.1
Kubernetes version (use kubectl version): 1.27
Cloud provider or hardware configuration: Kubeadm
Node image:
Network plugin and version:
Install tools:
Others:
In reflector.go, for some reason it was not able to find the namespace namespace-123, and it started printing Failed to retrieve (https://github.com/liqotech/liqo/blob/v0.10.1/pkg/virtualKubelet/reflection/generic/reflector.go#L307).
Because of that, pod.go (https://github.com/liqotech/liqo/blob/v0.10.1/pkg/virtualKubelet/reflection/workload/pod.go#L334) marked the local pod as Failed.
As the local pod was marked as Failed, podns.go (https://github.com/liqotech/liqo/blob/v0.10.1/pkg/virtualKubelet/reflection/workload/podns.go#L191) deleted the ShadowPod, which in turn deleted the Pod in the remote cluster.