Restart SwSS, syncd and dependent services if a critical process in syncd container exits unexpectedly #3534

Merged 2 commits on Nov 9, 2019

jleveque
Contributor

- What I did

Restart SwSS, syncd and dependent services if a critical process in the syncd container exits unexpectedly

- How I did it

Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit.
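
The implementation of docker-wait-any is not shown in this conversation, so the following is only a minimal sketch of the idea: block until any one of several containers exits. It assumes the standard `docker wait` CLI command, which blocks until a container stops and then prints its exit code. The names `docker_wait` and `wait_any` are hypothetical and chosen for illustration.

```python
import subprocess
import threading


def docker_wait(name):
    """Block until the named container exits; return its exit code as text."""
    # Assumption: the `docker` CLI is on PATH. `docker wait NAME` blocks until
    # the container stops, then prints the container's exit code.
    result = subprocess.run(
        ["docker", "wait", name], capture_output=True, text=True
    )
    return result.stdout.strip()


def wait_any(names, wait_fn=docker_wait):
    """Return (name, status) for whichever container exits first."""
    if not names:
        raise ValueError("at least one container name is required")

    done = threading.Event()
    lock = threading.Lock()
    first = []  # holds the single (name, status) result of the first exit

    def worker(name):
        status = wait_fn(name)
        with lock:
            if not first:
                first.append((name, status))
        done.set()

    for name in names:
        # Daemon threads: waiters still blocked on other containers
        # do not keep the process alive once the first result is in.
        threading.Thread(target=worker, args=(name,), daemon=True).start()

    done.wait()
    with lock:
        return first[0]
```

Under these assumptions, a service script could call `wait_any(["swss", "syncd"])` and proceed to its stop and restart logic as soon as either container exits. The `wait_fn` parameter exists only so the waiting logic can be exercised without a Docker daemon.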

- How to verify it

Run sudo pkill -11 <critical_process_in_syncd_container> and observe that the syncd service exits, swss and all dependent services exit, and then all of those services start back up.

@lguohan
Collaborator

lguohan commented Nov 9, 2019

retest broadcom please

@lguohan lguohan merged commit 85b0de3 into sonic-net:master Nov 9, 2019
@jleveque jleveque deleted the restart_swss_syncd_crash branch November 9, 2019 20:45
zhenggen-xu pushed a commit to zhenggen-xu/sonic-buildimage that referenced this pull request Jan 10, 2020
…cal process in syncd container exits unexpectedly (sonic-net#3534)
