Avoid keeping hold of partial bytes forever. #984
Merged: Lukasa merged 2 commits into apple:master from Lukasa:cb-no-leftovers-after-drip-feeding on Apr 29, 2019
Motivation:

The HTTPDecoder is a complex object that has very careful state management goals. One source of this complexity is that it is fed a stream of bytes with arbitrary chunk sizes, but needs to produce a collection of objects that are contiguous in memory. For example, each header field name and value must be turned into a String, which requires a contiguous sequence of bytes.

As a result, it is quite common for the HTTPDecoder to have only *part* of an object that must be emitted atomically. In this situation, the HTTPDecoder would like to instruct its ByteToMessageHandler to keep hold of the bytes that form the beginning of that object. To avoid asking http_parser to parse those bytes twice, the HTTPDecoder uses a value called httpParserOffset to keep track.

As an example, consider what would happen if the "Connection: keep-alive\r\n" header field was delivered in two chunks: first "Connection: keep-al", and then "ive\r\n". The header field name can be emitted in its entirety, but the partial field value must be preserved. To achieve this, the HTTPDecoder will store an offset internally to keep track of which bytes have been parsed. In this case, the offset will be set to 7: the number of bytes in "keep-al". It will then tell the rest of the code that only 12 bytes of the original 19 byte message were consumed, causing the ByteToMessageHandler to preserve those 7 bytes.

However, when the next chunk is received, the ByteToMessageHandler will *replay* those bytes to HTTPDecoder. To avoid parsing them a second time, HTTPDecoder keeps track of how many bytes it is expecting to see replayed. This is the value in httpParserOffset.

Due to a logic error in the HTTPDecoder, the httpParserOffset field was never returned to zero. This field would be modified whenever a partial field was received, but would never be returned to zero when a complete message was parsed. This would cause the HTTPDecoder to unnecessarily keep hold of extra bytes in the ByteToMessageHandler even when they were no longer needed. In some cases the number could get smaller, such as when a new partial field was received, but it could never drop to zero even when a complete HTTP message was received.

Happily, due to the rest of the HTTPDecoder logic this never produced an invalid message: while ByteToMessageHandler was repeatedly producing extra bytes, it never actually passed them to http_parser again, or caused any other issue. The only situation in which a problem would occur is if the HTTPDecoder had a RemoveAfterUpgradeStrategy other than .dropBytes. In that circumstance, decodeLast would not consume any extra bytes, but those bytes would have remained in the buffer passed to decodeLast, which would then incorrectly *forward them on*. This is the only circumstance in which this error manifested, and in most applications it led to surprising and irregular crashes on connection teardown. In all other applications the only effect was unnecessarily preserving a few tens of extra bytes on some connections, until receiving EOF caused us to drop all that memory anyway.

Modifications:

- Return httpParserOffset to 0 when a full message has been delivered.

Result:

Fewer weird crashes.
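
To make the bookkeeping described above concrete, here is a minimal, language-neutral sketch in Python. It is a toy model written for illustration only, not the SwiftNIO implementation: `ToyDecoder`, `ToyHandler`, `parser_offset`, `pending`, and the `reset_offset_on_message_end` flag are all hypothetical names, and the header parsing is deliberately simplistic. It reproduces the 19/12/7 byte arithmetic from the example, and shows how a stale offset leaves already-parsed bytes sitting in the handler's buffer without changing what gets parsed.

```python
NAME, VALUE = "name", "value"


class ToyDecoder:
    """Toy stand-in for the decoder; NOT the real HTTPDecoder logic."""

    def __init__(self, reset_offset_on_message_end):
        # Hypothetical flag: True models the fix, False models the bug.
        self.reset_offset_on_message_end = reset_offset_on_message_end
        self.parser_offset = 0  # replayed bytes to skip on the next call
        self.pending = 0        # bytes of an unfinished token seen so far
        self.state = NAME
        self.events = []

    def decode(self, buffer):
        """Parse only the bytes not seen before; return the bytes consumed."""
        pos = self.parser_offset              # skip the replayed bytes
        token_start = pos - self.pending      # unfinished token, if any
        message_ended = False
        while pos < len(buffer):
            byte = buffer[pos:pos + 1]
            if self.state == NAME:
                if byte == b":":
                    self.events.append(("name", buffer[token_start:pos].decode()))
                    self.state, token_start = VALUE, pos + 1
                elif byte == b"\n":           # blank line ends the message
                    self.events.append(("end", ""))
                    message_ended = True
                    token_start = pos + 1
            else:
                if byte == b" " and pos == token_start:
                    token_start = pos + 1     # skip padding after the colon
                elif byte == b"\n":
                    value = buffer[token_start:pos].rstrip(b"\r").decode()
                    self.events.append(("value", value))
                    self.state, token_start = NAME, pos + 1
            pos += 1
        self.pending = len(buffer) - token_start
        if self.pending > 0:
            self.parser_offset = self.pending  # keep the partial token
        elif message_ended and self.reset_offset_on_message_end:
            self.parser_offset = 0             # the fix: nothing left to keep
        # Otherwise the previous offset is left as-is (stale in the buggy model).
        return len(buffer) - self.parser_offset


class ToyHandler:
    """Toy stand-in for the handler that retains unconsumed bytes."""

    def __init__(self, decoder):
        self.decoder = decoder
        self.buffer = b""

    def receive(self, chunk):
        self.buffer += chunk                   # retained bytes are replayed
        consumed = self.decoder.decode(self.buffer)
        self.buffer = self.buffer[consumed:]
        return consumed


def run(reset_offset_on_message_end, chunks):
    decoder = ToyDecoder(reset_offset_on_message_end)
    handler = ToyHandler(decoder)
    consumed = [handler.receive(chunk) for chunk in chunks]
    return decoder.events, consumed, handler.buffer


CHUNKS = [b"Connection: keep-al", b"ive\r\n\r\n", b"Host: a\r\n\r\n"]

if __name__ == "__main__":
    for label, flag in (("buggy", False), ("fixed", True)):
        events, consumed, leftover = run(flag, CHUNKS)
        print(label, consumed, leftover, events)
```

In this model both variants emit the same events, which mirrors the observation that the stale offset never produced an invalid message. The difference is only in what remains buffered once the messages are complete: the variant that resets the offset ends with an empty buffer, while the other still holds bytes that a forwarding teardown path could hand on by mistake.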
weissi approved these changes on Apr 29, 2019
Thank you! And sorry for the complexity in HTTPDecoder
:'(
weissi pushed a commit to weissi/swift-nio that referenced this pull request on Apr 30, 2019
(cherry picked from commit ae3d298)