Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Clear socket.waiting before invoking callbacks #819

Merged
merged 1 commit into from
Feb 20, 2018

Conversation

thomaslee
Copy link
Contributor

If we pause() and then later resume() a consumer within a 'done' callback, we can get into a situation where the consumer doesn't actually resume because sendToBroker bails out early (because broker.socket.waiting is still true while the 'done' callback is being invoked, so the fetch request gets thrown away).

To work around this I've had to use setImmediate(() => consumer.resume()) to schedule the resume() to occur after the socket.waiting flag has been reset. It works well enough, but it's not very intuitive.

This change simply moves the socket.waiting reset to occur before we fire off the decoder & associated callbacks. I think this is pretty low risk, but perhaps you can think of a reason why we wouldn't want to do this.

Something else to consider: it can be somewhat surprising that requests to brokers are simply "thrown away" if the waiting flag is set: at a minimum, a log message before the continue in Client.sendToBroker would be great (I can add this if you like, it's easy enough).I fFeel like a more "correct" approach would be to fire the callback before that same continue so callers can at least notice that something went screwy, but that seems like a larger change.

@thomaslee
Copy link
Contributor Author

@hyperlink any thoughts on this one? The last part RE: requests thrown away is especially concerning since it means callbacks passed to sendToBroker may never get invoked. (Probably strictly a separate issue -- happy to open a separate ticket for that if it helps.)

@hyperlink
Copy link
Collaborator

@thomaslee I am not too familiar with the history of this code. And I don't understand why it's necessary to have a dedicated socket for fetches that blocks requests using the wait flag. Do you know? Or maybe @haio or @crzidea could shed some light on this?

I have tried in another branch to do away with this behavior and I've ran into some concerning intermittent test failures in travis.

I will kick off the build again and merge this if it's ok.

@hyperlink hyperlink merged commit 850e2d6 into SOHU-Co:master Feb 20, 2018
@thomaslee
Copy link
Contributor Author

Thanks very much for landing this. 🙂

And I don't understand why it's necessary to have a dedicated socket for fetches that blocks requests using the wait flag.

I actually didn't realize there were two sockets at play. Only thing I can think of is that on low-volume topics those fetches could be quite slow (since callers can tell the broker to wait some user-specified amount of time for messages to arrive) ... perhaps the original implementors were concerned about other (non-fetch) requests being sent while a fetch was in flight and "blocked"? My instinct is this shouldn't actually be a problem thanks to correlation IDs etc. -- a "blocked" fetch shouldn't delay other requests like topic metadata requests etc. -- but I don't know enough about the protocol to say for certain.

I have tried in another branch to do away with this behavior and I've ran into some concerning intermittent test failures in travis.

I'll take a look when I get a sec, but I fear I may not be much help. Still trying to get my head around some of the basics ... the interactions between different layers of the inheritance hierarchy on the consumer side still hurt my head a bit 😃 .

@crzidea
Copy link
Member

crzidea commented Mar 7, 2018

It is a long time since I wrote the first version of this module, so I can't remember everything clearly.
But it seems that longpolling socket can't fetch broker info while longpolling socket is working, so it's necessary to have a dedicated socket to fetch broker info. Actually, I'm not sure about this.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants