Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

notification: Fix memory leak of pending substreams #296

Merged
merged 4 commits into from
Dec 3, 2024

Conversation

lexnv
Copy link
Collaborator

@lexnv lexnv commented Dec 2, 2024

This PR fixes a subtle memory leak that can happen in the following edge-case situation:

  • connection is established and substream outbound is initiated with remote peer
  • the substream ID is tracked until the substream either completes successfully or fails
  • the connection is closed soon after, leading to no substream events ever being generated

For this edge-cases, we need to remove the tracking of the substream ID when the connection is reported as closed.

This has been detected after running a node for more than 2 days with the following generic metrics PR:

Closes: #295

cc @paritytech/networking

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
@lexnv lexnv added the bug Something isn't working label Dec 2, 2024
@lexnv lexnv self-assigned this Dec 2, 2024
@lexnv lexnv requested a review from dmitry-markin December 2, 2024 09:34
NotificationError::Rejected,
)
.await;
}
// user either initiated an outbound substream or an outbound substream was
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs need to be updated.

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
lexnv added a commit that referenced this pull request Dec 3, 2024
Similar to #296, there is a
possibility of leaking memory in the following edge-case:
- T0: Connection is established and outbound substream is initiated with
peer
  - This maps the substream ID to the request bytes information
- T1: Connection is closed before the service has a chance to report
`TransportEvent::SubstreamOpened` or
`TransportEvent::SubstreamOpenFailure`

In this case, if we connect and immediately disconnect with a request in
flight, we are effectively leaking the request bytes.


Detected by:
- #294


### Dashboard

- We are leaking ~111 requests over 3 days timespan:

<img width="1484" alt="Screenshot 2024-12-03 at 10 41 01"
src="https://github.com/user-attachments/assets/f6701017-4add-4aa1-aee1-e1f8d33d54f3">


cc @paritytech/networking

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
@lexnv lexnv merged commit af19055 into master Dec 3, 2024
8 checks passed
@lexnv lexnv deleted the lexnv/pending-outbound-fix branch December 3, 2024 14:07
lexnv added a commit that referenced this pull request Dec 3, 2024
## [0.8.3] - 2024-12-03

This release includes two fixes for small memory leaks on edge-cases in
the notification and request-response protocols.

### Fixed

- req-resp: Fix memory leak of pending substreams
([#297](#297))
- notification: Fix memory leak of pending substreams
([#296](#296))

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Co-authored-by: Bastian Köcher <git@kchr.de>
github-merge-queue bot pushed a commit to paritytech/polkadot-sdk that referenced this pull request Dec 4, 2024
## [0.8.3] - 2024-12-03

This release includes two fixes for small memory leaks on edge-cases in
the notification and request-response protocols.

### Fixed

- req-resp: Fix memory leak of pending substreams
([#297](paritytech/litep2p#297))
- notification: Fix memory leak of pending substreams
([#296](paritytech/litep2p#296))

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
lexnv added a commit to paritytech/polkadot-sdk that referenced this pull request Dec 4, 2024
## [0.8.3] - 2024-12-03

This release includes two fixes for small memory leaks on edge-cases in
the notification and request-response protocols.

### Fixed

- req-resp: Fix memory leak of pending substreams
([#297](paritytech/litep2p#297))
- notification: Fix memory leak of pending substreams
([#296](paritytech/litep2p#296))

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Krayt78 pushed a commit to Krayt78/polkadot-sdk that referenced this pull request Dec 18, 2024
## [0.8.3] - 2024-12-03

This release includes two fixes for small memory leaks on edge-cases in
the notification and request-response protocols.

### Fixed

- req-resp: Fix memory leak of pending substreams
([paritytech#297](paritytech/litep2p#297))
- notification: Fix memory leak of pending substreams
([paritytech#296](paritytech/litep2p#296))

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
dudo50 pushed a commit to paraspell-research/polkadot-sdk that referenced this pull request Jan 4, 2025
## [0.8.3] - 2024-12-03

This release includes two fixes for small memory leaks on edge-cases in
the notification and request-response protocols.

### Fixed

- req-resp: Fix memory leak of pending substreams
([paritytech#297](paritytech/litep2p#297))
- notification: Fix memory leak of pending substreams
([paritytech#296](paritytech/litep2p#296))

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

notifications: Investigate possible memory leak with pending outbound substream tracking
3 participants