Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[Services] Restart NAT service upon unexpected critical process exit. #4208

Merged
merged 3 commits into from
Mar 5, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions dockers/docker-nat/Dockerfile.j2
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,8 @@ RUN apt-get update \
COPY ["start.sh", "/usr/bin/"]
COPY ["supervisord.conf", "/etc/supervisor/conf.d/"]
COPY ["restore_nat_entries.py", "/usr/bin/"]
COPY ["files/supervisor-proc-exit-listener", "/usr/bin"]
COPY ["critical_processes", "/etc/supervisor"]

RUN apt-get clean -y; apt-get autoclean -y; apt-get autoremove -y
RUN rm -rf /debs
Expand Down
2 changes: 2 additions & 0 deletions dockers/docker-nat/critical_processes
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
natmgrd
natsyncd
8 changes: 7 additions & 1 deletion dockers/docker-nat/supervisord.conf
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,12 @@ logfile_maxbytes=1MB
logfile_backups=2
nodaemon=true

[eventlistener:supervisor-proc-exit-listener]
command=/usr/bin/supervisor-proc-exit-listener --container-name nat
events=PROCESS_STATE_EXITED
autostart=true
autorestart=unexpected

[program:start.sh]
command=/usr/bin/start.sh
priority=1
Expand All @@ -15,7 +21,7 @@ stderr_logfile=syslog
command=/usr/sbin/rsyslogd -n
priority=2
autostart=false
autorestart=false
autorestart=unexpected
stdout_logfile=syslog
stderr_logfile=syslog

Expand Down
4 changes: 4 additions & 0 deletions files/build_templates/nat.service.j2
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,16 @@ Description=NAT container
Requires=updategraph.service swss.service
After=updategraph.service swss.service syncd.service
Before=ntp-config.service
StartLimitIntervalSec=1200
StartLimitBurst=3

[Service]
User={{ sonicadmin_user }}
ExecStartPre=/usr/bin/{{docker_container_name}}.sh start
ExecStart=/usr/bin/{{docker_container_name}}.sh wait
ExecStop=/usr/bin/{{docker_container_name}}.sh stop
Restart=always
RestartSec=30

[Install]
WantedBy=multi-user.target swss.service
Expand Down
2 changes: 1 addition & 1 deletion rules/docker-nat.mk
Original file line number Diff line number Diff line change
Expand Up @@ -31,4 +31,4 @@ $(DOCKER_NAT)_RUN_OPT += -v /etc/sonic:/etc/sonic:ro
$(DOCKER_NAT)_RUN_OPT += -v /host/warmboot:/var/warmboot

$(DOCKER_NAT)_BASE_IMAGE_FILES += natctl:/usr/bin/natctl

$(DOCKER_NAT)_BASE_IMAGE_FILES += $(SUPERVISOR_PROC_EXIT_LISTENER_SCRIPT)