Nginx pod restarting in a loop after applying configuration changes #13137

Open
astralko opened this issue Apr 1, 2025 · 5 comments
Labels
kind/support Categorizes issue or PR as a support question. needs-priority triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

astralko commented Apr 1, 2025

What happened:

I edited the nginx-ingress-controller ConfigMap, changing allow-snippet-annotations: "false" to:

allow-snippet-annotations: "true"
annotations-risk-level: Critical
use-forwarded-headers: "true"

Afterwards, my NGINX ingress controller pod logged errors while reloading the configuration:

I0401 08:32:56.115814 1 event.go:377] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"lordabbett", Name:"tenant-websocket-ingress", UID:"a0a86d94-4323-4cd4-86ac-9aefb443dac7", APIVersion:"networking.k8s.io/v1", ResourceVersion:"1222523732", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
E0401 08:32:56.572590 1 controller.go:211] Unexpected failure reloading the backend:
exit status 1
2025/04/01 08:32:54 [notice] 549#549: signal process started
2025/04/01 08:32:54 [alert] 549#549: kill(34, 1) failed (3: No such process)
nginx: [alert] kill(34, 1) failed (3: No such process)
E0401 08:32:56.572655 1 queue.go:131] "requeuing" err=<
exit status 1
2025/04/01 08:32:54 [notice] 549#549: signal process started
2025/04/01 08:32:54 [alert] 549#549: kill(34, 1) failed (3: No such process)
nginx: [alert] kill(34, 1) failed (3: No such process)

key="tenant/tenant-user-api-service-z9dsc"
I0401 08:32:56.572890 1 event.go:377] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"nginx-ingress-controller-8d6857857-ljjf4", UID:"af80e5d9-42cc-4f84-b6c8-b1fc92c723ce", APIVersion:"v1", ResourceVersion:"1222453499", FieldPath:""}): type: 'Warning' reason: 'RELOAD' Error reloading NGINX: exit status 1
2025/04/01 08:32:54 [notice] 549#549: signal process started
2025/04/01 08:32:54 [alert] 549#549: kill(34, 1) failed (3: No such process)
nginx: [alert] kill(34, 1) failed (3: No such process)

This happened over and over; the pod crashed in a loop.
The only fix was scaling the deployment to 0 and then back to 2.

What you expected to happen:

I expected the config reload to work without crashing. I'm not sure why scaling down and up again fixed the issue.
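
For readers trying to interpret the repeated alert in the log above, here is a minimal Python sketch that decodes a `kill(PID, SIGNAL) failed (ERRNO: ...)` line into names. It assumes Linux signal and errno numbering; the function name `decode_kill_alert` is made up for illustration. On Linux, signal 1 is SIGHUP (the signal NGINX uses to reload its configuration) and errno 3 is ESRCH, meaning no process with that PID exists. That would be consistent with the NGINX master process no longer running under PID 34, although the logs here do not show why.

```python
import errno
import re
import signal

# Matches nginx alert lines such as:
#   nginx: [alert] kill(34, 1) failed (3: No such process)
KILL_ALERT = re.compile(r"kill\((\d+), (\d+)\) failed \((\d+): ([^)]*)\)")


def decode_kill_alert(line):
    """Return the PID, signal name, and errno name from a kill() alert, or None."""
    match = KILL_ALERT.search(line)
    if match is None:
        return None
    pid, signum, err, message = match.groups()
    return {
        "pid": int(pid),
        # Signal numbers are platform-specific; this lookup assumes Linux.
        "signal": signal.Signals(int(signum)).name,
        "errno": errno.errorcode.get(int(err), "unknown"),
        "message": message,
    }


if __name__ == "__main__":
    print(decode_kill_alert("nginx: [alert] kill(34, 1) failed (3: No such process)"))
```
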

NGINX Ingress controller version (exec into the pod and run /nginx-ingress-controller --version):


NGINX Ingress controller
Release: 1.12.1
Build: 64780b1
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.25.5


Kubernetes version (use kubectl version):

Environment:

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 12 (bookworm)
  • Kernel (e.g. uname -a): Linux nginx-ingress-controller-79bb8d944d-kr6ld 5.10.233-223.887.amzn2.x86_64 #1 SMP Sat Jan 11 16:55:02 UTC 2025 x86_64 GNU/Linux
  • Install tools: helm
  • Basic cluster related info:
    • kubectl version:
      Client Version: v1.28.8-eks-ae9a62a
      Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
      Server Version: v1.31.6-eks-bc803b4
    • kubectl get nodes -o wide
      NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
      ip-10-10-33-119.us-east-2.compute.internal Ready 92m v1.31.4-eks-aeac579 10.10.33.119 Amazon Linux 2 5.10.233-223.887.amzn2.x86_64 containerd://1.7.25
      ip-10-10-33-125.us-east-2.compute.internal Ready 6d19h v1.31.4-eks-aeac579 10.10.33.125 Amazon Linux 2 5.10.233-223.887.amzn2.x86_64 containerd://1.7.25
  • How was the ingress-nginx-controller installed:
    • If helm was used then please show output of helm ls -A | grep -i ingress
      nginx-ingress-controller kube-system 3 2025-03-30 09:41:32.702978 +0300 IDT deployed nginx-ingress-controller-11.6.12 1.12.1
    • If helm was used then please show output of helm -n <ingresscontrollernamespace> get values <helmreleasename>
      USER-SUPPLIED VALUES:
      defaultBackend:
        image:
          registry: xxxxxxxx.dkr.ecr.us-east-2.amazonaws.com
          repository: gytpol-nginx-ingress-controller
          tag: 1.27.4-debian-12-r7
      global:
        security:
          allowInsecureImages: true
      image:
        registry: xxxxxxxx.dkr.ecr.us-east-2.amazonaws.com
        repository: gytpol-nginx-ingress-controller
        tag: 1.12.1-debian-12-r0

(These images were pulled with:
docker pull --platform linux/amd64 docker.io/bitnami/nginx-ingress-controller:1.12.1-debian-12-r0
docker pull --platform linux/amd64 docker.io/bitnami/nginx:1.27.4-debian-12-r7
)

  • Current State of the controller:
    • kubectl describe ingressclasses
      Name: alb
      Labels: app.kubernetes.io/instance=aws-load-balancer-controller
      app.kubernetes.io/managed-by=Helm
      app.kubernetes.io/name=aws-load-balancer-controller
      app.kubernetes.io/version=v2.4.2
      helm.sh/chart=aws-load-balancer-controller-1.4.3
      Annotations: meta.helm.sh/release-name: aws-load-balancer-controller
      meta.helm.sh/release-namespace: kube-system
      Controller: ingress.k8s.aws/alb
      Events:

      Name: nginx
      Labels: app.kubernetes.io/component=controller
      app.kubernetes.io/instance=nginx-ingress-controller
      app.kubernetes.io/managed-by=Helm
      app.kubernetes.io/name=nginx-ingress-controller
      app.kubernetes.io/version=1.12.1
      helm.sh/chart=nginx-ingress-controller-11.6.12
      Annotations: meta.helm.sh/release-name: nginx-ingress-controller
      meta.helm.sh/release-namespace: kube-system
      Controller: k8s.io/ingress-nginx
      Events:

  • Others:
    • Any other related information like ;
      We upgraded to this version from a much older one from 2022.

Anything else we need to know:

@astralko astralko added the kind/bug Categorizes issue or PR as related to a bug. label Apr 1, 2025
@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority labels Apr 1, 2025
@longwuyuan
Contributor

/remove-kind bug
/kind support
/triage needs-information

Please reproduce using a kind cluster

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. triage/needs-information Indicates an issue needs more information in order to work on it. and removed kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 1, 2025
@Gacko
Member

Gacko commented Apr 2, 2025

I assume you already have an Ingress resource that uses snippet annotations, but they were simply ignored before you enabled them. Now that they are enabled, the Ingress NGINX Controller renders them into your NGINX configuration. Since we can hardly validate such snippet annotations, a mistake in one of the snippets you provided may be what makes the underlying NGINX crash now.
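
One way to check this assumption is to list every Ingress that carries a snippet annotation and review those snippets by hand. The following is a minimal Python sketch, not an official tool: it assumes the input is the parsed JSON from `kubectl get ingress -A -o json`, and that snippet annotations use the `nginx.ingress.kubernetes.io/` prefix with a key ending in `-snippet` (for example `configuration-snippet` or `server-snippet`). The function name and the sample objects are made up for illustration.

```python
SNIPPET_PREFIX = "nginx.ingress.kubernetes.io/"


def find_snippet_annotations(ingress_list):
    """Return (namespace, name, annotation key) for each snippet annotation found."""
    found = []
    for item in ingress_list.get("items", []):
        meta = item.get("metadata", {})
        # "annotations" may be missing or null on an Ingress without annotations.
        for key in (meta.get("annotations") or {}):
            if key.startswith(SNIPPET_PREFIX) and key.endswith("-snippet"):
                found.append((meta.get("namespace"), meta.get("name"), key))
    return found


if __name__ == "__main__":
    # Hypothetical sample data standing in for real cluster output.
    sample = {
        "items": [
            {
                "metadata": {
                    "namespace": "demo",
                    "name": "with-snippet",
                    "annotations": {
                        "nginx.ingress.kubernetes.io/configuration-snippet": "return 200;",
                    },
                }
            },
            {"metadata": {"namespace": "demo", "name": "plain"}},
        ]
    }
    for namespace, name, key in find_snippet_annotations(sample):
        print(f"{namespace}/{name}: {key}")
```

To run it against a real cluster, the sample dictionary could be replaced with the result of `json.loads` applied to the `kubectl get ingress -A -o json` output.
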

@strongjz
Member

strongjz commented Apr 2, 2025

@Gacko is likely correct. There is another difference that I'd like to point out as well: the image is
docker.io/bitnami/nginx-ingress-controller:1.12.1-debian-12-r0

Are you using the Bitnami chart to deploy as well, or just their images with this project's chart?

@acfmarcelo

@strongjz I'm using the Bitnami Helm chart and got the same error. Maybe something is wrong with the Bitnami chart?

(combined from similar events): Error reloading NGINX: exit status 1
2025/04/01 20:26:03 [notice] 1060#1060: signal process started
2025/04/01 20:26:03 [alert] 1060#1060: kill(29, 1) failed (3: No such process)
nginx: [alert] kill(29, 1) failed (3: No such process)

@Gacko
Member

Gacko commented Apr 2, 2025

Can you please use our Helm chart and controller image? We cannot provide support for 3rd party Helm charts or controller images.


6 participants