libgit2: dispose connections in SubTransport.Close #775

Merged 1 commit on Jun 9, 2022

pjbgf (Member) commented Jun 8, 2022

The average SubTransport lifecycle encompasses two Action calls. Previously,
the implementation attempted to share the same connection across both calls.
That did not work, as some Git servers do not support multiple sessions on the
same connection. The implementation was never fully transitioned to the
"one connection per action" model, which led to connections being leaked.

The switch to an RW mutex avoids unnecessary blocking in the
goroutine at the start of the second action call.

Note that when the context is done, the client-level resources (the
connection) are now freed as well. This ensures that SSH connections
do not outlive the subtransport.

Relates to fluxcd/image-automation-controller#334
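
To illustrate the "one connection per action" model described above, here is a minimal sketch. It is not the code from this PR: `fakeConn`, `subTransport`, `newSubTransport`, `openConns`, and `run` are hypothetical names, and the fake connection only records whether it was closed. The sketch assumes that each action disposes the previous connection before opening its own, that `Close` disposes whatever is still open, and that a goroutine calls `Close` once the context is done.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// fakeConn is a hypothetical stand-in for an SSH or HTTP connection.
type fakeConn struct{ closed bool }

func (c *fakeConn) Close() error { c.closed = true; return nil }

// subTransport is a simplified, hypothetical model of a subtransport
// that owns at most one live connection at a time.
type subTransport struct {
	mu      sync.RWMutex
	conn    *fakeConn   // connection used by the current action
	dialed  []*fakeConn // every connection ever opened (demo bookkeeping only)
	stopped chan struct{}
}

// newSubTransport ties the transport's resources to ctx: once ctx is
// done, the current connection is disposed and stopped is closed.
func newSubTransport(ctx context.Context) *subTransport {
	t := &subTransport{stopped: make(chan struct{})}
	go func() {
		defer close(t.stopped)
		<-ctx.Done()
		_ = t.Close()
	}()
	return t
}

// Action disposes the previous connection, then opens a fresh one.
func (t *subTransport) Action() error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if err := t.disposeLocked(); err != nil {
		return err
	}
	t.conn = &fakeConn{}
	t.dialed = append(t.dialed, t.conn)
	return nil
}

// Close disposes the current connection; it is safe to call repeatedly.
func (t *subTransport) Close() error {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.disposeLocked()
}

// disposeLocked closes the current connection. Callers must hold t.mu.
func (t *subTransport) disposeLocked() error {
	if t.conn == nil {
		return nil
	}
	err := t.conn.Close()
	t.conn = nil
	return err
}

// openConns counts connections that were opened but never closed.
// It only reads state, so a read lock is enough.
func (t *subTransport) openConns() int {
	t.mu.RLock()
	defer t.mu.RUnlock()
	n := 0
	for _, c := range t.dialed {
		if !c.closed {
			n++
		}
	}
	return n
}

// run performs two actions, then cancels the context and waits for cleanup.
func run() (afterActions, afterCancel int) {
	ctx, cancel := context.WithCancel(context.Background())
	t := newSubTransport(ctx)
	_ = t.Action()
	_ = t.Action()
	afterActions = t.openConns()
	cancel()
	<-t.stopped
	afterCancel = t.openConns()
	return afterActions, afterCancel
}

func main() {
	a, b := run()
	fmt.Println("open after two actions:", a) // prints 1
	fmt.Println("open after cancel:", b)      // prints 0
}
```

In this sketch the first connection is closed when the second action starts, so only one connection is ever open, and none remain once the context is cancelled. A real transport would also have to refuse new actions after cancellation, which the sketch leaves out.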

Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
pjbgf added the area/git (Git related issues and pull requests) label on Jun 8, 2022
pjbgf added this to the GA milestone on Jun 8, 2022
pjbgf mentioned this pull request on Jun 8, 2022

pjbgf (Member, Author) commented Jun 8, 2022

In testing against IAC (image-automation-controller), goroutine counts stayed at low, healthy levels (75-100), as opposed to the previous 1500+.

darkowlzz (Contributor) left a comment

LGTM!
Thanks.

pjbgf merged commit 1faa547 into fluxcd:main on Jun 9, 2022
pjbgf deleted the leak-conns branch on June 9, 2022 at 07:54

kallaics commented

Hi @pjbgf,
could you please release this fix quickly? Our system cannot handle the large number of connections, and our Git server drops all of the connections 2-3 times per day. We are currently on 031.1, and as far as I can see, this fix is unfortunately not included.
Csaba

pjbgf (Member, Author) commented Jun 13, 2022

@kallaics we have a release candidate from Friday with this fix and the logging improvements (#778):

ghcr.io/fluxcd/source-controller:rc-b877bc21
ghcr.io/fluxcd/image-automation-controller:rc-843074dd

kallaics commented

@pjbgf The patched images are working well, thanks.

pjbgf mentioned this pull request on Jun 16, 2022
Labels
area/git Git related issues and pull requests
Projects
Status: Done