Add HLD for Orchagent error handling improvements #1698

Open
wants to merge 7 commits into
base: master
from

Conversation

prabhataravind

This HLD change attempts to address the following:

  • Handle all ASIC/SAI programming errors gracefully without causing orchagent to crash or restart
  • Detect missed notifications from APP_DB to orchagent in SONiC systems that use redis-based communication channels
  • Detect out-of-sync entries between APP_DB and ASIC_DB

@prabhataravind prabhataravind marked this pull request as ready for review June 24, 2024 00:37
@zhangyanzhao
Collaborator

Please leave comments if you want to be a reviewer of this HLD. Thanks.

@zhangyanzhao
Collaborator

@prabhataravind can you please add the code PRs by referring to #806? Thanks.

@zhangyanzhao
Collaborator

The HLD PR is not merged and there is no code PR. Moving to backlog.

Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
@mssonicbld
Collaborator

/azp run


No pipelines are associated with this pull request.

Signed-off-by: Prabhat Aravind <paravind@microsoft.com>
@mssonicbld
Collaborator

/azp run


No pipelines are associated with this pull request.

![sai status handling](images/sai_status_handling.png)

Note that some combinations in the table above are not valid scenarios, for example SAI_STATUS_INSUFFICIENT_RESOURCES when removing an object, or SAI_STATUS_ITEM_NOT_FOUND when creating an object. They are listed only for completeness.
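
As a rough illustration of the point about invalid combinations, the sketch below flags (operation, status) pairs that should not occur in practice. It is not taken from the HLD or from orchagent: the `Op` enum, the `UNEXPECTED` set, and `is_expected` are hypothetical names, and only the two combinations named in the text are included. The status strings mirror the SAI status names mentioned above.

```python
from enum import Enum


class Op(Enum):
    """Hypothetical operation kinds; not orchagent's own types."""
    CREATE = "create"
    SET = "set"
    REMOVE = "remove"


# SAI status names quoted in the text, kept as plain strings for illustration.
INSUFFICIENT_RESOURCES = "SAI_STATUS_INSUFFICIENT_RESOURCES"
ITEM_NOT_FOUND = "SAI_STATUS_ITEM_NOT_FOUND"

# Assumption: only the two combinations the text calls out are listed here.
# The full table lives in images/sai_status_handling.png and is not reproduced.
UNEXPECTED = {
    (Op.REMOVE, INSUFFICIENT_RESOURCES),  # removing an object should not exhaust resources
    (Op.CREATE, ITEM_NOT_FOUND),          # creating an object should not report "not found"
}


def is_expected(op: Op, status: str) -> bool:
    """Return False for combinations that are listed only for completeness."""
    return (op, status) not in UNEXPECTED
```

A handler could use such a check to log unexpected combinations more prominently while still applying the action the table specifies for them.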

Contributor

Please add a section for Bulk API failure handling.

Contributor

Please check for bulk stats API failures.

Projects
Status: MovedToBacklog
Status: 📋 In Plan Features

5 participants