[nvidia][hsflowd] Fix Dropmon co-operation issues related to HW stop #73

Closed
wants to merge 5 commits into from

@vivekrnv vivekrnv commented Aug 19, 2023

Why I did it

Stopping the sflow service causes other dropmon clients to stop receiving drops.

Repro Steps:

config feature state sflow disabled 
config sflow enable
config sflow collector add temp1 192.168.1.10 
	
<Start other dropmon clients>
<Client will be receiving the drops >

config sflow disable <Exit hsflowd process>
<Other clients will stop receiving drops>

Work item tracking
  • Microsoft ADO (number only):

How I did it

  • During process exit, hsflowd no longer stops HW drops in drop monitor. On the nvidia platform, HW drops are controlled by another daemon.
  • SW drops are started in NET_DM only when sw=on is set in hsflowd.conf.
  • CONFIG failures are no longer counted in feedcontrolerrors, because if feedcontrolerrors > 0 the application does not stop SW drops even when it exits. CONFIG is likely to fail with -EBUSY if NET_DM has already been configured by another daemon.
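
The three rules above can be sketched as a small model. This is an illustrative Python sketch only, not the hsflowd implementation (which is written in C): the class `DropmonModel` and the functions `record_result`, `feeds_to_start` and `feeds_to_stop_on_exit` are hypothetical names, and the assumption that failures of requests other than CONFIG are still counted is inferred from the description rather than confirmed by it.

```python
import errno
from dataclasses import dataclass
from typing import List


@dataclass
class DropmonModel:
    """Hypothetical model of the described behaviour (not hsflowd code)."""
    sw_requested: bool            # True when sw=on is set in hsflowd.conf
    sw_started: bool = False      # set once this process starts SW drops
    feedcontrolerrors: int = 0    # counter named in the description


def record_result(model: DropmonModel, op: str, rc: int) -> None:
    """Count a failed NET_DM request, except for CONFIG."""
    if rc >= 0:
        return
    if op == "CONFIG":
        # CONFIG can fail with -EBUSY when another daemon has already
        # configured NET_DM, so the failure is not counted.
        return
    # Assumption: failures of other requests are still counted.
    model.feedcontrolerrors += 1


def feeds_to_start(model: DropmonModel) -> List[str]:
    """SW drops are started only when sw=on was requested."""
    if model.sw_requested:
        model.sw_started = True
        return ["sw"]
    return []


def feeds_to_stop_on_exit(model: DropmonModel) -> List[str]:
    """HW drops are never stopped here; another daemon owns them."""
    if model.sw_started and model.feedcontrolerrors == 0:
        return ["sw"]
    return []


# sw=on case where CONFIG fails with -EBUSY: SW drops are still stopped on exit.
m = DropmonModel(sw_requested=True)
record_result(m, "CONFIG", -errno.EBUSY)
feeds_to_start(m)
print(feeds_to_stop_on_exit(m))  # ['sw']
```

In this model the default sw=off case returns an empty list on exit, so nothing is turned off and other dropmon clients are left undisturbed.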

How to verify it

  1. Run the repro steps and confirm that the other client keeps receiving drops (default sw=off case).
Aug 19 00:32:44.212065 r-leopard-41 NOTICE sflow#sflowmgrd: :- sflowHandleService: Starting hsflowd service
Aug 19 00:32:44.212237 r-leopard-41 INFO sflow#hsflowd: started
Aug 19 00:32:44.212282 r-leopard-41 INFO sflow#hsflowd: autoload SONIC and PSAMPLE modules
Aug 19 00:32:44.212282 r-leopard-41 INFO sflow#hsflowd: drop-monitor support for SONiC
Aug 19 00:33:40.446770 r-leopard-41 INFO sflow#hsflowd: dropmon state INIT -> GET_FAMILY
Aug 19 00:33:40.446770 r-leopard-41 INFO sflow#hsflowd: dropmon state GET_FAMILY -> GOT_GROUP
Aug 19 00:33:41.037737 r-leopard-41 INFO sflow#hsflowd: dropmon state GOT_GROUP -> JOIN_GROUP
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: dropmon state JOIN_GROUP -> CONFIGURE
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: Configuring DropMon Failed, Module is already in Monitoring State, Continue...
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: message repeated 2 times: [ Configuring DropMon Failed, Module is already in Monitoring State, Continue...]
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: dropmon state CONFIGURE -> START
Aug 19 00:34:42.042191 r-leopard-41 INFO sflow#hsflowd: dropmon state START -> RUN
Aug 19 00:35:07.746616 r-leopard-41 INFO sflow#hsflowd: Received SIGTERM
Aug 19 00:35:07.445474 r-leopard-41 INFO sflow#hsflowd: dropmon state RUN -> STOP
Aug 19 00:35:07.795261 r-leopard-41 INFO sflow#hsflowd: stopped
  2. Test the sw=on case
Aug 19 00:55:29.802990 r-leopard-41 INFO sflow#hsflowd: started
Aug 19 00:55:29.802990 r-leopard-41 INFO sflow#hsflowd: autoload SONIC and PSAMPLE modules
Aug 19 00:55:29.802990 r-leopard-41 INFO sflow#hsflowd: drop-monitor support for SONiC
Aug 19 00:55:38.213225 r-leopard-41 INFO sflow#hsflowd: dropmon state INIT -> GET_FAMILY
Aug 19 00:55:38.213225 r-leopard-41 INFO sflow#hsflowd: dropmon state GET_FAMILY -> GOT_GROUP
Aug 19 00:55:38.804221 r-leopard-41 INFO sflow#hsflowd: dropmon state GOT_GROUP -> JOIN_GROUP
Aug 19 00:55:39.808749 r-leopard-41 INFO sflow#hsflowd: dropmon state JOIN_GROUP -> CONFIGURE
Aug 19 00:55:39.808749 r-leopard-41 INFO sflow#hsflowd: Configuring DropMon Failed, Module is already in Monitoring State, Continue...
Aug 19 00:55:40.813554 r-leopard-41 INFO sflow#hsflowd: message repeated 2 times: [ Configuring DropMon Failed, Module is already in Monitoring State, Continue...]
Aug 19 00:55:40.813554 r-leopard-41 INFO sflow#hsflowd: dropmon state CONFIGURE -> START
Aug 19 00:55:42.430937 r-leopard-41 INFO sflow#hsflowd: dropmon state START -> RUN
Aug 19 00:56:36.441267 r-leopard-41 INFO sflow#hsflowd: Received SIGTERM
Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon: graceful shutdown: turning off feed
Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon state RUN -> STOP
Aug 19 00:56:36.507932 r-leopard-41 INFO sflow#hsflowd: stopped
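
The two logs above differ mainly in whether the "turning off feed" line appears before RUN -> STOP. As a rough aid for checking this, the following Python sketch extracts the dropmon state transitions from syslog lines and reports whether the feed was turned off. The helper names `dropmon_transitions` and `feed_turned_off` are hypothetical, and the patterns are based only on the message formats shown in the logs above.

```python
import re
from typing import Iterable, List, Tuple

# Patterns taken from the hsflowd log lines shown above.
STATE_RE = re.compile(r"dropmon state (\w+) -> (\w+)")
FEED_OFF = "dropmon: graceful shutdown: turning off feed"


def dropmon_transitions(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (from_state, to_state) pairs in the order they appear."""
    transitions = []
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            transitions.append((match.group(1), match.group(2)))
    return transitions


def feed_turned_off(lines: Iterable[str]) -> bool:
    """True if hsflowd logged that it turned off the feed on shutdown."""
    return any(FEED_OFF in line for line in lines)


# Tail of the sw=on log from step 2.
sample = [
    "Aug 19 00:55:42.430937 r-leopard-41 INFO sflow#hsflowd: dropmon state START -> RUN",
    "Aug 19 00:56:36.441267 r-leopard-41 INFO sflow#hsflowd: Received SIGTERM",
    "Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon: graceful shutdown: turning off feed",
    "Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon state RUN -> STOP",
]
print(dropmon_transitions(sample))  # [('START', 'RUN'), ('RUN', 'STOP')]
print(feed_turned_off(sample))      # True
```

For the default sw=off log from step 1, `feed_turned_off` would be expected to return False, since that log contains no "turning off feed" message.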

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@vivekrnv vivekrnv marked this pull request as draft August 21, 2023 20:11
@vivekrnv vivekrnv marked this pull request as ready for review August 22, 2023 01:59
@vivekrnv vivekrnv marked this pull request as draft August 31, 2023 18:07
@vivekrnv vivekrnv closed this Sep 13, 2023
vivekrnv pushed a commit that referenced this pull request Oct 13, 2023
…e latest HEAD automatically (sonic-net#15016)

src/wpasupplicant/sonic-wpa-supplicant

* a24412c25 - (HEAD -> 202205, origin/master, origin/HEAD, origin/202211, origin/202205, master) [mka]: Fix unexpected cleanup (#73) (8 days ago) [Ze Gan]
* 26d1da0bc - [mka]: Fix re-establishment by reset MI (#72) (8 days ago) [Ze Gan]
* f07e0a097 - [azp]: Update build pipeline to build for Bullseye (#70) (4 weeks ago) [Ze Gan]
*   2c69e2cda - Use github code scanning instead of LGTM (#69) (6 months ago) [Liu Shilong]
|\  
| * 23abb04e5 - fix (6 months ago) [shilongliu]
| * f34d68fe6 - libdbus-1-dev (6 months ago) [shilongliu]
| * dc2dd881e - add dbus (6 months ago) [shilongliu]
| * 5de037661 - use swsscommon packages (6 months ago) [shilongliu]
| * 32c5a2729 - Use github code scanning instead of LGTM (6 months ago) [shilongliu]
|/  
* aa731b96f - [azp]: Install libyang in azure pipeline (#68) (8 months ago) [Hua Liu]
* 71b635d74 - Revert "[Azp]: Upgrade Azp to bullseye (#49)" (#66) (9 months ago) [Ze Gan]
* 7aa4e6fa4 - Adding Microsoft SECURITY.MD (#58) (9 months ago) [microsoft-github-policy-service[bot]]