Skip to content

http: client inconsistent behavior with content length header #35958

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
delvedor opened this issue Nov 4, 2020 · 24 comments
Closed

http: client inconsistent behavior with content length header #35958

delvedor opened this issue Nov 4, 2020 · 24 comments
Labels
http Issues or PRs related to the http subsystem.

Comments

@delvedor
Copy link
Member

delvedor commented Nov 4, 2020

Hey folks!
I've encountered a weird bug. Every now and then it happens that the underlying connection between the client and the server breaks (not related to node, but to the infrastructure) while collecting the body. The easiest way to detect this is by checking the content-length header, as (as far as I know) there is no event we can listen to for detecting the broken connection while reading the body stream.
While testing the behavior of the client with the content-length header but with a body with an incorrect length, I've noticed that there is a big difference in how this is handled based on the node version. Following you can find the snippet for reproducing the issue and the behavior I've encountered based on the node version, the presence of the http agent, and the content-length header.

  • Version: latest 10/12/14/15
  • Platform: Mac OS
  • Subsystem: http

What steps will reproduce the bug?

The code for reproducing the bug can be found in the following gist:
https://gist.github.com/delvedor/0d15a34259fd2d2a21055f6e82a67380#file-http-client-bug-js

How often does it reproduce? Is there a required condition?

Consistently based on the versions of Node.

What is the expected behavior?

I'm not really sure what should be the right behavior, but I would expect:

  • If the content-length header is not respected, the client should throw an error.
  • The end event should be emitted in any case

What do you see instead?

The behavior changes between node versions.
You can find the order of the events in the same gist:
https://gist.github.com/delvedor/0d15a34259fd2d2a21055f6e82a67380#file-events-md

/cc @mcollina @ronag

@ronag
Copy link
Member

ronag commented Nov 4, 2020

I think an error/aborted event should be emitted and no end event.

@Sirius79
Copy link

Sirius79 commented Nov 4, 2020

Hi @ronag would this be a good first issue to work on?

@ronag
Copy link
Member

ronag commented Nov 4, 2020

No, I’m unsure if and what the problem here is. Behavior seems ok in v15 and fixing earlier versions is probably semver major

@delvedor
Copy link
Member Author

delvedor commented Nov 4, 2020

@ronag I think at least node v14 should be fixed and aligned to node v15, at the moment (in node v14) if the request has a content-length header bigger than the actual body, the response never ends.

@mcollina
Copy link
Member

mcollina commented Nov 5, 2020

@delvedor could you please articulate the difference between the various Node.js release lines?

@ronag
Copy link
Member

ronag commented Nov 5, 2020

@mcollina I think it's pretty well described in https://gist.github.com/delvedor/0d15a34259fd2d2a21055f6e82a67380#file-events-md

@mcollina
Copy link
Member

mcollina commented Nov 5, 2020

I think v14 should align either with v12 or v15.

@delvedor
Copy link
Member Author

delvedor commented Nov 5, 2020

I don't know what's the difference between v12 and v14 there, but I think it would be better to align v14 to v15, as v14 will stick around for a while. Better to adopt the right/new behavior sooner rather than later. What do you think?

@mmomtchev
Copy link
Contributor

I think an error/aborted event should be emitted and no end event.

@delvedor how does the connection 'break'? Sudden loss of connectivity that ends with a timeout? This should be an 'error' event.

@delvedor
Copy link
Member Author

delvedor commented Nov 5, 2020

how does the connection 'break'?

I don't have an answer to this, you can see the discussion that originated this issue here.

This should be an 'error' event.

You will get an error event if there is a connectivity error as soon as you try to send the request, but if the connection breaks while reading the body, no error event is emitted.

A way of reproducing this scenario was to set a content-length header bigger than the body, which causes a different behavior based on the version of node.

@mmomtchev
Copy link
Contributor

mmomtchev commented Nov 5, 2020

Maybe there is really a an issue with the content length, I will try digging, but I don't think it is a reliable way to verify that the HTTP request was successful - absence of 'error' should be the right way to do it
And there should definitely be an 'error' event if your connection is reset
How are you getting that ECONNRESET if it is not with an 'error' event?
For me the absence of 'error' event when the connection was reset is the far more serious problem

@mmomtchev
Copy link
Contributor

mmomtchev commented Nov 5, 2020

@delvedor, I just went through your gist, I confirm that you get different behavior from different versions of Node - which is not very good - but this is the very obscure case of a server that responds with a content-length that is longer than the actual body and then proceeds to cleanly close the connection - IMHO this is not a real-world scenario, it is more like an HTTP compliance test.
I don't think this is what actually happens in your case
The ECONNRESET you see here is an artificial ECONNRESET sent by the Node HTTP client to indicate a premature close - it is not a real network error

@mcollina
Copy link
Member

mcollina commented Nov 5, 2020

@delvedor, I just went through your gist, I confirm that you get different behavior from different versions of Node - which is not very good - but this is the very obscure case of a server that responds with a content-length that is longer than the actual body and then proceeds to cleanly close the connection - IMHO this is not a real-world scenario, it is more like an HTTP compliance test.

I have seen a few situation where this actually happens. Most of those cases are bugs either in the server or a proxy middleman that for some reason truncates the response. You are right that it is about HTTP compliance and somewhat an edge case to consider. Nevertheless, I think the behavior of v14 regarding to this is not correct and we should do something about it.

@ronag is there a PR of yours that would need to be backported? Do you know which ones of the PRs changed this, as we can also revert that.

@mmomtchev
Copy link
Contributor

mmomtchev commented Nov 5, 2020

I agree that Node's event behavior should be consistent across versions, but I wonder if this gist is representative of his real problem. @delvedor are you positive that you get a truncated body with no network error? This is a serious issue, far more serious than Node's inconsistent behavior on invalid content-length. Relying on the content-length is an ugly kludge at best. From what I understand, this happens on GCP, so you cannot run network sniffers? Is this HTTP server accessible from the outside world and can you try running your HTTP client outside GCP so that you can trace what happens on the network level?

That last message on your discussion mentions an ECONNRESET on read - this is a real networking issue and it is not the same as the one produced by your gist. And I see a JS stack trace - so you must be getting an 'error' event somewhere.

@mmomtchev
Copy link
Contributor

@delvedor, check #35824 - @lpinca 's remark and my last comment, maybe this is the 'error' you are getting

@mmomtchev
Copy link
Contributor

mmomtchev commented Nov 5, 2020

@mcollina @delvedor @ronag Node 12 and Node 14 both emit an 'aborted' message on the response object in this case - normal close before the end of the HTTP response - which is not considered an error and won't have any effect if it is not explicitly handled - but can be listened to
(just make sure you are listening on the response object)
Node 15 will also emit an 'error' - and only if there is an active handler - in order to be compatible with all the existing user code that was not handling this error

if (res.listenerCount('error') > 0) {

This is all that needs changing if you want to reproduce Node 15 behavior

@ronag
Copy link
Member

ronag commented Nov 5, 2020

ah yes, @delvedor you need to listen to res.'aborted' in pre node v15. This might actually not be a problem then.

@mmomtchev
Copy link
Contributor

The only case that still needs some work is the 'keepAlive: true` case, but this is a nightmarish case - just look at the situation - the remote server announces a content-length, then sends some of the data, then stops without closing the connection and then keeps it alive with no errors. This is currently handled in all versions (ie Node recovers) but the behavior is not very consistent across different versions.
My opinion is that the benefit of consistency doesn't justify the risk of juggling around all that legacy code.
@delvedor just one tip, there are network proxies that can simulate packet errors on the wire.

@delvedor
Copy link
Member Author

delvedor commented Nov 6, 2020

I was not aware of the response.on('aborted') event, from my test, it looks that solves this issue very nicely in any Node.js version (from 10 to 15).
@ronag in Node v15 the 'aborted' event gets emitted in any case :)

I'll do some additional tests, before confirming it, but at this point, I'm positive the 'aborted' is what I need, thanks!

@delvedor
Copy link
Member Author

delvedor commented Nov 6, 2020

Do you recommend to call incomingMessage.destroy in case of the aborted event?
For example:

incomingMessage.on('aborted', () => {
  incomingMessage.destroy()
  handleError(new Error('kaboom'))
})

@mmomtchev
Copy link
Contributor

Just be aware of #35923 which is probably staying like this.
What is your HTTP server software? Is it Node or is it something else?

@delvedor
Copy link
Member Author

delvedor commented Nov 7, 2020

@mmomtchev In this case Elasticsearch, but the error happens when there is a lot of infrastructure the middle.

Thanks for the linked issue.
I'm not sure if I should call or not response.destroy (without an error, I'm handling the aborted event in a different way). In my test there is no difference if I call it or not, so I would say that I can, just to be sure that everything will be cleaned up.

@mmomtchev
Copy link
Contributor

@delvedor I prefer to leave it to @ronag or @mcollina for the official way to do it

I was asking for your infrastructure, because we just found an ECONNRESET problem with Node servers running on Windows
Try sniffing the network to see what is happening, or if this is not possible, try to locally induce ECONNRESET errors - you can do it by fiddling with iptables or pf and there are specifically designed network proxies

@targos targos added the http Issues or PRs related to the http subsystem. label Dec 27, 2020
@marco-ippolito
Copy link
Member

this was probably solved with #46601, feel free to reopen if it's not the case

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
http Issues or PRs related to the http subsystem.
Projects
None yet
Development

No branches or pull requests

7 participants