-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
fix: better clean up of file handles #2823
Conversation
The elf-binary-package-cataloger does its own file IO to account for the possibility of a logical ELF package being broken across multiple physical files. However, this casued it to skip the normal invocation pattern in the generic cataloger code that prevented file leaks. Ensure this cataloger always closes its file handles. Signed-off-by: Will Murphy <will.murphy@anchore.com>
Otherwise, a panicking cataloger could leak file handles. Signed-off-by: Will Murphy <will.murphy@anchore.com>
|
||
log.WithFields("path", location.RealPath).Trace("parsing file contents") | ||
invokeParser := func(location file.Location, parser Parser) ([]pkg.Package, []artifact.Relationship, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this be a static function over a closure? it would be simpler to reason about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. It will have a lot more parameters (env and logger), but it will be easier to reason about, so that's probably the right trade off. I'll try to push up a rev.
Signed-off-by: Will Murphy <will.murphy@anchore.com>
Signed-off-by: Will Murphy <will.murphy@anchore.com>
@wagoodman / @spiffcs - refactored and added a unit test for not leaking file on generic cataloger panic. |
Signed-off-by: Will Murphy <will.murphy@anchore.com>
@@ -122,17 +123,9 @@ func (c *Cataloger) Catalog(ctx context.Context, resolver file.Resolver) ([]pkg. | |||
|
|||
log.WithFields("path", location.RealPath).Trace("parsing file contents") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional opportunity for unrelated cleanup -- this should be using logger
not log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome test 🥇
Previously, the ELF binary package cataloger would leak file handles, and the generic cataloger loop would leak file handles if parsing the file panicked. Instead, ensure both situations
defer close
new file readers as soon as they are made.The incorrect behavior had been happening for some time, but was probably exposed by #2814 which caused the ELF binary package cataloger to rely more on file handles and less on in-memory buffers.
Fixes #2819 - in my testing, syft 1.3.0 needs about 1500 concurrent file handles to parse
pytorch/pytorch:latest
; after this change, the number is closer to 800.