Fix memory leak while parsing improperly terminated inherit-expressions
#31
First of all, big thanks to @fufexan who helped me to reliably reproduce this.

Originally discovered in `rnix-lsp`[1], but I confirmed that `nixpkgs-fmt` is also affected.

Basically, when given an expression such as `let inherit`, the parser would wait for a `TOKEN_SEMICOLON` indefinitely. The actual problem, however, is that `self.parse_val()` always detects the SAME syntax error, i.e. "unexpected EOF", and this error is written into `self.errors` indefinitely. `errors` is of type `Vec<ParseError>`, and a vector in Rust grows in an amortized fashion[2]: if an entry is pushed and the vector exceeds the currently allocated size, its capacity is ~doubled (though the exact growth factor isn't constant).

This essentially means that the buffer keeps growing without bound, and pretty fast: according to KDE heaptrack, my system allocated ~9.5GB after 20s while running some tests.

I added an exit condition to the loop traversing through `inherit`-subexpressions to avoid that. Checking for an "unexpected EOF" is actually sufficient here:

* There's either a `;` later in the expression, causing the loop to terminate and causing an actual "unexpected token" error then.
* Otherwise, `parse_val` will go through the tokens until a matching semicolon is found (which is not the case) and then reach the end of the file. In that case, `unexpected EOF` is returned by `parse_val`.

[1] nix-community/rnix-lsp#33
[2] https://www.cs.cornell.edu/courses/cs3110/2011sp/Lectures/lec20-amortized/amortized.htm
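The exit condition described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual rnix-parser code: the `Token`, `ParseError`, and `Parser` types, as well as the bodies of `parse_val` and `parse_inherit`, are simplified assumptions made up for this example. It only shows why breaking out of the loop on "unexpected EOF" keeps `errors` bounded.

```rust
// Hypothetical, simplified stand-ins; these are NOT rnix-parser's real types.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Semicolon,
}

#[derive(Debug, Clone, PartialEq)]
enum ParseError {
    UnexpectedEof,
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
    errors: Vec<ParseError>,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, pos: 0, errors: Vec::new() }
    }

    fn peek(&self) -> Option<&Token> {
        self.tokens.get(self.pos)
    }

    /// Consumes one token as a "value". At end of input nothing is consumed,
    /// and the same "unexpected EOF" error is recorded on every call.
    fn parse_val(&mut self) -> Result<(), ParseError> {
        if self.pos < self.tokens.len() {
            self.pos += 1;
            Ok(())
        } else {
            self.errors.push(ParseError::UnexpectedEof);
            Err(ParseError::UnexpectedEof)
        }
    }

    /// Parses the entries following `inherit` up to the terminating `;`.
    fn parse_inherit(&mut self) {
        loop {
            if self.peek() == Some(&Token::Semicolon) {
                self.pos += 1; // consume `;` and finish normally
                break;
            }
            // Exit condition: without this check, a missing `;` would make
            // the loop call `parse_val` forever, pushing one error per turn.
            if matches!(self.parse_val(), Err(ParseError::UnexpectedEof)) {
                break;
            }
        }
    }
}
```

With this sketch, an input that lacks the terminating `;` (for example a single `Token::Ident`) leaves exactly one `UnexpectedEof` entry in `errors` after `parse_inherit` returns, instead of an ever-growing vector.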
@fufexan btw would you mind building your rnix-lsp with this branch of
@Ma27 I would, I'm just not sure how to build it, as I haven't worked with rust/cargo before, so I've got no clue how to replace the default
Oh right, sorry! The thing is,
This should build rnix-parser, using this PR:
@fufexan did you have a chance to test this? :)
@Ma27 yes, all works fine. Haven't had a memleak since switching :)
This is great, and the code looks good to me. Thank you @Ma27!