[warm-reboot]: Neighbor entry is not restored after warm-reboot #3108

Closed
volodymyrsamotiy opened this issue Jul 2, 2019 · 2 comments
Collaborator

volodymyrsamotiy commented Jul 2, 2019

Description

A valid, reachable neighbor entry is not restored after warm-reboot.
Before the warm-reboot command is issued, the neighbor is REACHABLE in Linux and is also programmed in HW.

root@sonic:/home/admin# ip neigh show to 30.1.10.101
30.1.10.101 dev Vlan3001 lladdr 24:8a:07:9c:86:02 REACHABLE

After the warm-reboot procedure, once the switch has booted up, this neighbor is not present in HW and is marked as FAILED in Linux.

root@sonic:/home/admin# warm-reboot
...
root@sonic:/home/admin# ip neigh show to 30.1.10.101
30.1.10.101 dev Vlan3001 lladdr 24:8a:07:9c:86:02 FAILED
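
The before/after comparison above can be automated. Below is a minimal Python sketch, not part of SONiC: the names `NeighEntry`, `parse_neigh_line`, and `lost_after_reboot` are hypothetical, and treating FAILED or INCOMPLETE as "lost" is an assumption based on the output shown in this report.

```python
from typing import NamedTuple, Optional


class NeighEntry(NamedTuple):
    ip: str
    dev: Optional[str]
    lladdr: Optional[str]
    state: str


def parse_neigh_line(line: str) -> NeighEntry:
    """Parse one line of `ip neigh show` output as printed in this report."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError(f"unexpected neighbor line: {line!r}")
    fields = {}
    # "dev" and "lladdr" are keyword/value pairs; walk adjacent tokens to find them.
    for key, value in zip(tokens[1:], tokens[2:]):
        if key in ("dev", "lladdr"):
            fields[key] = value
    # The first token is the IP address and the last token is the NUD state.
    return NeighEntry(tokens[0], fields.get("dev"), fields.get("lladdr"), tokens[-1])


def lost_after_reboot(before: NeighEntry, after: NeighEntry) -> bool:
    """Assumption: a REACHABLE entry that ends up FAILED or INCOMPLETE was lost."""
    return before.state == "REACHABLE" and after.state in {"FAILED", "INCOMPLETE"}


before = parse_neigh_line("30.1.10.101 dev Vlan3001 lladdr 24:8a:07:9c:86:02 REACHABLE")
after = parse_neigh_line("30.1.10.101 dev Vlan3001 lladdr 24:8a:07:9c:86:02 FAILED")
print(lost_after_reboot(before, after))
```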

It looks like the neighbor was removed for some reason during reconciliation after warm-reboot.
Below is the log snippet with the "delete" messages.

INFO swss#supervisord: restore_neighbors restore_neighbors service is started
INFO swss#supervisord: restore_neighbors restore_neighbor service is done for system warmreboot
NOTICE swss#neighsyncd: :- isNeighRestoreDone: neighbor table restore to kernelis done
INFO swss#supervisord: neighsyncd Listens to neigh messages...
NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, delete key: Vlan3001:30.1.10.101,
NOTICE swss#neighsyncd: :- reconcile: NEIGH_TABLE STALE/DELETE, key: Vlan3001:30.1.10.101, neigh:24:8a:07:9c:86:02, family:IPv4, cache-state:DELETE,
NOTICE swss#orchagent: :- removeNeighbor: Removed next hop 30.1.10.101 on Vlan3001
NOTICE swss#orchagent: :- removeNeighbor: Removed neighbor 24:8a:07:9c:86:02 onVlan3001
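
To find such deletions in a longer syslog, a small filter may help. This is a rough Python sketch that assumes the message layout shown in the snippet above (`delete key:` from insertToMap and `cache-state:DELETE` from reconcile); the function name `deleted_neigh_keys` is hypothetical.

```python
import re

# Matches the NEIGH_TABLE key that follows "key: " in the neighsyncd messages above.
KEY_RE = re.compile(r"key: (?P<key>[^,\s]+)")


def deleted_neigh_keys(log_lines):
    """Return NEIGH_TABLE keys that neighsyncd logged as deleted, in first-seen order."""
    keys = []
    for line in log_lines:
        # Only neighsyncd messages about NEIGH_TABLE are of interest here.
        if "neighsyncd" not in line or "NEIGH_TABLE" not in line:
            continue
        if "delete key:" not in line and "cache-state:DELETE" not in line:
            continue
        match = KEY_RE.search(line)
        if match and match.group("key") not in keys:
            keys.append(match.group("key"))
    return keys


sample = [
    "INFO swss#supervisord: neighsyncd Listens to neigh messages...",
    "NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, delete key: Vlan3001:30.1.10.101,",
    "NOTICE swss#neighsyncd: :- reconcile: NEIGH_TABLE STALE/DELETE, key: Vlan3001:30.1.10.101, "
    "neigh:24:8a:07:9c:86:02, family:IPv4, cache-state:DELETE,",
    "NOTICE swss#orchagent: :- removeNeighbor: Removed next hop 30.1.10.101 on Vlan3001",
]
print(deleted_neigh_keys(sample))
```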

Steps to reproduce the issue:

  1. Configure the neighbor entry using the configuration below.
{
    "VLAN": {
        "Vlan3001": {
            "vlanid": 3001
        }
    },
    "VLAN_MEMBER": {
        "Vlan3001|Ethernet8": {
            "tagging_mode": "tagged"
        }
    },
    "VLAN_INTERFACE": {
        "Vlan3001": {},
        "Vlan3001|30.1.10.1/24": {}
    },
    "NEIGH": {
        "Vlan3001|30.1.10.101": {
            "family": "IPv4"
        }
    }
}
  2. Verify that the neighbor is reachable.
  3. Execute the warm-reboot command.
  4. Wait until the switch finishes restoration after warm-reboot.
  5. Observe that the neighbor is not reachable.
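
Before running these steps, it may be worth confirming that the configured neighbor lies inside a subnet of its VLAN interface, to rule out a configuration mistake. The Python sketch below assumes the `interface|address` key layout seen in the configuration from step 1; the function name `neighbors_outside_subnet` is hypothetical and this is not a SONiC tool.

```python
import ipaddress


def neighbors_outside_subnet(config):
    """Return NEIGH keys whose IP is not inside any prefix configured on the same interface."""
    prefixes = {}
    for key in config.get("VLAN_INTERFACE", {}):
        if "|" not in key:
            continue  # bare interface entry such as "Vlan3001"
        ifname, prefix = key.split("|", 1)
        prefixes.setdefault(ifname, []).append(ipaddress.ip_interface(prefix).network)

    outside = []
    for key in config.get("NEIGH", {}):
        ifname, ip = key.split("|", 1)
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in prefixes.get(ifname, [])):
            outside.append(key)
    return outside


config = {
    "VLAN_INTERFACE": {"Vlan3001": {}, "Vlan3001|30.1.10.1/24": {}},
    "NEIGH": {"Vlan3001|30.1.10.101": {"family": "IPv4"}},
}
# An empty list means every configured neighbor falls inside an interface subnet.
print(neighbors_outside_subnet(config))
```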

Describe the results you received:
The reachable neighbor entry is not present after warm-reboot.

Describe the results you expected:
The neighbor should be restored after warm-reboot.

Additional information you deem important (e.g. issue happens only occasionally):

root@sonic:/home/admin# show version

SONiC Software Version: SONiC.HEAD.19-8c3fdfd0
Distribution: Debian 9.9
Kernel: 4.9.0-9-2-amd64
Build commit: 8c3fdfd0
Build date: Sat Jun 29 07:28:14 UTC 2019
Built by: johnar@jenkins-worker-4

Platform: x86_64-mlnx_msn2700-r0
HwSKU: ACS-MSN2700
ASIC: mellanox
Serial Number: MT1822K07823
Uptime: 12:25:18 up  1:13,  2 users,  load average: 3.40, 3.51, 3.60

Docker images:
REPOSITORY                 TAG                 IMAGE ID            SIZE
docker-syncd-mlnx          HEAD.19-8c3fdfd0    64d5cb77da05        369MB
docker-syncd-mlnx          latest              64d5cb77da05        369MB
docker-lldp-sv2            HEAD.19-8c3fdfd0    f8486c5aeb69        299MB
docker-lldp-sv2            latest              f8486c5aeb69        299MB
docker-dhcp-relay          HEAD.19-8c3fdfd0    0e2d0fa51c81        288MB
docker-dhcp-relay          latest              0e2d0fa51c81        288MB
docker-database            HEAD.19-8c3fdfd0    2fc55bdfa038        280MB
docker-database            latest              2fc55bdfa038        280MB
docker-snmp-sv2            HEAD.19-8c3fdfd0    797c740bae2c        313MB
docker-snmp-sv2            latest              797c740bae2c        313MB
docker-orchagent           HEAD.19-8c3fdfd0    68d9ed9b22a4        319MB
docker-orchagent           latest              68d9ed9b22a4        319MB
docker-teamd               HEAD.19-8c3fdfd0    2b78121ef284        301MB
docker-teamd               latest              2b78121ef284        301MB
docker-sonic-telemetry     HEAD.19-8c3fdfd0    6b14465f032d        302MB
docker-sonic-telemetry     latest              6b14465f032d        302MB
docker-router-advertiser   HEAD.19-8c3fdfd0    2619418c8ab1        280MB
docker-router-advertiser   latest              2619418c8ab1        280MB
docker-platform-monitor    HEAD.19-8c3fdfd0    b698c6a6ea2e        394MB
docker-platform-monitor    latest              b698c6a6ea2e        394MB
docker-fpm-frr             HEAD.19-8c3fdfd0    88fcddc62877        319MB
docker-fpm-frr             latest              88fcddc62877        319MB

Contributor

yxieca commented Sep 12, 2019

@prsunny, your recent change addressed this issue, right?

Contributor

prsunny commented Sep 12, 2019

Yes, this can be closed. Fixed by sonic-net/sonic-swss#1040

@prsunny prsunny closed this as completed Sep 12, 2019
mssonicbld added a commit that referenced this issue Mar 28, 2024
…atically (#18240)

#### Why I did it
src/sonic-utilities
```
* bdc57206 - (HEAD -> master, origin/master, origin/HEAD) Revert "Fix for Switch Port Modes and VLAN CLI Enhancement (#3108)" (#3246) (89 minutes ago) [jingwenxie]
* e35452b7 - Modify "show interface transceiver status" CLI to show SW cmis state (#3238) (2 days ago) [mihirpat1]
* 04a33e1f - Add "state" field in CONFIG_DB a toggle of the fabric port monitor feature (#2932) (2 days ago) [jfeng-arista]
* 3c489ba5 - Enhance route-check for multi-asic platforms (#3216) (5 days ago) [Deepak Singhal]
* c149e48b - [chassis] Add chassis support for CLI "config qos reload" (#3233) (6 days ago) [wenyiz2021]
* d8541add - Update port2alias (#3217) (8 days ago) [abdosi]
* d4688a8f - [graceful reboot] Add the pre_reboot_hook script execution, add the watchdog arm before the reboot (#3203) (8 days ago) [Vadym Hlushko]
* 125f36f3 - [ipintutil]Handle exception in show ip interfaces command (#3182) (10 days ago) [Sudharsan Dhamal Gopalarathnam]
* 9d532017 - [chassis][show-runningconfig] Fix the show runningconfiguration all issue on the Supervisor (#3194) (2 weeks ago) [Marty Y. Lok]
* 1a9261ce - [Techsupport]Handle SAI kv pair if present in sai common profile (#3196) (2 weeks ago) [Sudharsan Dhamal Gopalarathnam]
* 7466dc4a - Skip the validation of action in acl-loader if capability table in STATE_DB is empty (#3199) (2 weeks ago) [bingwang-ms]
* b879b658 - [Bug] Fix fw_setenv illegel character issue (#3201) (3 weeks ago) [xumia]
* 0b41a560 - [config] Add YANG alerting for override (#3188) (3 weeks ago) [jingwenxie]
* 24683b0c - [show] multi-asic show running test residue (#3198) (3 weeks ago) [jingwenxie]
* 995a797a - CLI to skip polling for periodic information for a port in DomInfoUpdateTask thread (#3187) (3 weeks ago) [mihirpat1]
* 9aa9eaa5 - [config] Add Table hard dependency check (#3159) (3 weeks ago) [jingwenxie]
* 5f0ffcca - [fast/warm-reboot] Put ERR message in syslog when a failure is seen (#3186) (4 weeks ago) [Vaibhav Hemant Dixit]
* 92220dcf - Fix for Switch Port Modes and VLAN CLI Enhancement (#3108) (4 weeks ago) [Saba Akram]
```
#### How I did it
#### How to verify it
#### Description for the changelog