Nginx pod restarting in a loop after applying configuration changes #13137

astralko · 2025-04-01T10:39:11Z

What happened:

I edited the nginx-ingress-controller ConfigMap, changing allow-snippet-annotations: "false" to:

allow-snippet-annotations: "true"
annotations-risk-level: Critical
use-forwarded-headers: "true"

Afterwards, the NGINX Ingress controller pod logged errors each time it reloaded the configuration:

I0401 08:32:56.115814 1 event.go:377] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"lordabbett", Name:"tenant-websocket-ingress", UID:"a0a86d94-4323-4cd4-86ac-9aefb443dac7", APIVersion:"networking.k8s.io/v1", ResourceVersion:"1222523732", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
E0401 08:32:56.572590 1 controller.go:211] Unexpected failure reloading the backend:
exit status 1
2025/04/01 08:32:54 [notice] 549#549: signal process started
2025/04/01 08:32:54 [alert] 549#549: kill(34, 1) failed (3: No such process)
nginx: [alert] kill(34, 1) failed (3: No such process)
E0401 08:32:56.572655 1 queue.go:131] "requeuing" err=<
exit status 1
2025/04/01 08:32:54 [notice] 549#549: signal process started
2025/04/01 08:32:54 [alert] 549#549: kill(34, 1) failed (3: No such process)
nginx: [alert] kill(34, 1) failed (3: No such process)

key="tenant/tenant-user-api-service-z9dsc"
I0401 08:32:56.572890 1 event.go:377] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"nginx-ingress-controller-8d6857857-ljjf4", UID:"af80e5d9-42cc-4f84-b6c8-b1fc92c723ce", APIVersion:"v1", ResourceVersion:"1222453499", FieldPath:""}): type: 'Warning' reason: 'RELOAD' Error reloading NGINX: exit status 1
2025/04/01 08:32:54 [notice] 549#549: signal process started
2025/04/01 08:32:54 [alert] 549#549: kill(34, 1) failed (3: No such process)
nginx: [alert] kill(34, 1) failed (3: No such process)

This repeated over and over, and the pod crashed in a loop.
The only fix was scaling the deployment to 0 and then back to 2.
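
A note for readers decoding the alert in the log above: nginx prints the arguments of the failed kill() call followed by the errno it received. The sketch below is not part of the original report. It uses Python's standard signal and errno modules to translate those numbers; the function name decode_kill_alert is made up for illustration, and the signal lookup assumes a POSIX system.

```python
import errno
import re
import signal

# Matches nginx alerts such as: kill(34, 1) failed (3: No such process)
KILL_ALERT = re.compile(r"kill\((\d+), (\d+)\) failed \((\d+): ([^)]*)\)")


def decode_kill_alert(line):
    """Return pid, signal name, and errno name from an nginx kill() alert, or None."""
    match = KILL_ALERT.search(line)
    if match is None:
        return None
    pid, signum, err = (int(match.group(i)) for i in (1, 2, 3))
    return {
        "pid": pid,
        # Signal numbers are platform specific; this lookup assumes POSIX.
        "signal": signal.Signals(signum).name,
        "errno": errno.errorcode.get(err, str(err)),
        "message": match.group(4),
    }


print(decode_kill_alert("nginx: [alert] kill(34, 1) failed (3: No such process)"))
```

On Linux this reports signal SIGHUP and errno ESRCH for the line above, which suggests the reload tried to signal PID 34 and no process with that PID existed at that moment. It does not by itself explain why that process was gone.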

What you expected to happen:

I expected the config reload to work without crashing. I'm not sure why scaling down and back up fixed the issue.

NGINX Ingress controller version (exec into the pod and run /nginx-ingress-controller --version):

NGINX Ingress controller
Release: 1.12.1
Build: 64780b1
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.25.5

Kubernetes version (use kubectl version):

Environment:

Cloud provider or hardware configuration: AWS
OS (e.g. from /etc/os-release): Debian GNU/Linux 12 (bookworm)
Kernel (e.g. uname -a): Linux nginx-ingress-controller-79bb8d944d-kr6ld 5.10.233-223.887.amzn2.x86_64 #1 SMP Sat Jan 11 16:55:02 UTC 2025 x86_64 GNU/Linux
Install tools: helm
Basic cluster related info:
- kubectl version:
  Client Version: v1.28.8-eks-ae9a62a
  Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
  Server Version: v1.31.6-eks-bc803b4
- kubectl get nodes -o wide
  NAME                                         STATUS   ROLES    AGE     VERSION               INTERNAL-IP    EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
  ip-10-10-33-119.us-east-2.compute.internal   Ready    <none>   92m     v1.31.4-eks-aeac579   10.10.33.119   <none>        Amazon Linux 2   5.10.233-223.887.amzn2.x86_64   containerd://1.7.25
  ip-10-10-33-125.us-east-2.compute.internal   Ready    <none>   6d19h   v1.31.4-eks-aeac579   10.10.33.125   <none>        Amazon Linux 2   5.10.233-223.887.amzn2.x86_64   containerd://1.7.25
How was the ingress-nginx-controller installed:
- If helm was used then please show output of helm ls -A | grep -i ingress
  nginx-ingress-controller kube-system 3 2025-03-30 09:41:32.702978 +0300 IDT deployed nginx-ingress-controller-11.6.12 1.12.1
- If helm was used then please show output of helm -n <ingresscontrollernamespace> get values <helmreleasename>
  USER-SUPPLIED VALUES:
  defaultBackend:
    image:
      registry: xxxxxxxx.dkr.ecr.us-east-2.amazonaws.com
      repository: gytpol-nginx-ingress-controller
      tag: 1.27.4-debian-12-r7
  global:
    security:
      allowInsecureImages: true
  image:
    registry: xxxxxxxx.dkr.ecr.us-east-2.amazonaws.com
    repository: gytpol-nginx-ingress-controller
    tag: 1.12.1-debian-12-r0

(these images were pulled with:
docker pull --platform linux/amd64 docker.io/bitnami/nginx-ingress-controller:1.12.1-debian-12-r0
docker pull --platform linux/amd64 docker.io/bitnami/nginx:1.27.4-debian-12-r7
)

Current State of the controller:
- kubectl describe ingressclasses
  Name:         alb
  Labels:       app.kubernetes.io/instance=aws-load-balancer-controller
                app.kubernetes.io/managed-by=Helm
                app.kubernetes.io/name=aws-load-balancer-controller
                app.kubernetes.io/version=v2.4.2
                helm.sh/chart=aws-load-balancer-controller-1.4.3
  Annotations:  meta.helm.sh/release-name: aws-load-balancer-controller
                meta.helm.sh/release-namespace: kube-system
  Controller:   ingress.k8s.aws/alb
  Events:

  Name:         nginx
  Labels:       app.kubernetes.io/component=controller
                app.kubernetes.io/instance=nginx-ingress-controller
                app.kubernetes.io/managed-by=Helm
                app.kubernetes.io/name=nginx-ingress-controller
                app.kubernetes.io/version=1.12.1
                helm.sh/chart=nginx-ingress-controller-11.6.12
  Annotations:  meta.helm.sh/release-name: nginx-ingress-controller
                meta.helm.sh/release-namespace: kube-system
  Controller:   k8s.io/ingress-nginx
  Events:

Others:
- Any other related information:
  We upgraded to this version from a much older one, from 2022.

Anything else we need to know:

longwuyuan · 2025-04-01T19:58:22Z

/remove-kind bug
/kind support
/triage needs-information

Please reproduce using a kind cluster

Gacko · 2025-04-02T12:08:45Z

I assume you already have an Ingress resource that uses snippet annotations, which were simply ignored until you enabled them. Now that they are enabled, the Ingress NGINX Controller renders them into your NGINX configuration. Since we can hardly validate snippet annotations, a mistake in one of your snippets may now be crashing the underlying NGINX.
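
One way to test this hypothesis is to list which Ingress resources carry snippet annotations at all. The following is a minimal sketch, not an official tool: it assumes the input has the shape of `kubectl get ingress -A -o json` output, treats any `nginx.ingress.kubernetes.io/` annotation ending in `-snippet` as a snippet (a heuristic), and uses a made-up function name and made-up sample data.

```python
import json

# Annotation prefix used by ingress-nginx; the "-snippet" suffix match is a heuristic.
SNIPPET_PREFIX = "nginx.ingress.kubernetes.io/"


def ingresses_with_snippets(ingress_list):
    """Yield (namespace, name, annotation_key) for each snippet annotation found."""
    for item in ingress_list.get("items", []):
        meta = item.get("metadata", {})
        for key in sorted(meta.get("annotations") or {}):
            if key.startswith(SNIPPET_PREFIX) and key.endswith("-snippet"):
                yield meta.get("namespace"), meta.get("name"), key


# Hypothetical sample standing in for `kubectl get ingress -A -o json` output.
sample = json.loads("""
{
  "items": [
    {"metadata": {"namespace": "demo", "name": "with-snippet",
                  "annotations": {
                    "nginx.ingress.kubernetes.io/configuration-snippet": "more_set_headers \\"X-Demo: 1\\";"
                  }}},
    {"metadata": {"namespace": "demo", "name": "plain",
                  "annotations": {"nginx.ingress.kubernetes.io/rewrite-target": "/"}}},
    {"metadata": {"namespace": "demo", "name": "no-annotations"}}
  ]
}
""")

for namespace, name, key in ingresses_with_snippets(sample):
    print(f"{namespace}/{name}: {key}")
```

Against a real cluster, the sample would be replaced by the parsed output of the kubectl command; any Ingress reported is a candidate for review before the snippets are rendered into nginx.conf.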

strongjz · 2025-04-02T13:28:33Z

@Gacko is likely correct. There is another difference that I'd like to point out as well, the image:
docker.io/bitnami/nginx-ingress-controller:1.12.1-debian-12-r0

Are you using the Bitnami chart to deploy as well, or just their images with this project's chart?

acfmarcelo · 2025-04-02T13:53:34Z

@strongjz I'm using the Bitnami Helm chart and got the same error. Maybe something is wrong with the Bitnami chart?

(combined from similar events): Error reloading NGINX: exit status 1
2025/04/01 20:26:03 [notice] 1060#1060: signal process started
2025/04/01 20:26:03 [alert] 1060#1060: kill(29, 1) failed (3: No such process)
nginx: [alert] kill(29, 1) failed (3: No such process)

Gacko · 2025-04-02T15:17:46Z

Can you please use our Helm chart and controller image? We cannot provide support for 3rd party Helm charts or controller images.

astralko added the kind/bug label Apr 1, 2025

k8s-ci-robot added the needs-triage and needs-priority labels Apr 1, 2025

strongjz added this to [SIG Network] Ingress NGINX Apr 1, 2025
