Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Force gevent connected socket to blocking mode #1616

Merged
merged 1 commit into from
Nov 12, 2017

Conversation

jkemp101
Copy link
Contributor

@jkemp101 jkemp101 commented Oct 6, 2017

The gevent worker assumes sockets are running in blocking mode. The current code forces the listening socket into blocking mode here. But the connected socket is not forced to blocking mode. Therefore if other code has set the default socket timeout with a call to setdefaulttimeout it will be set as the connected socket's timeout in the gevent initialization code here.

The gevent worker relies on the keepalive timeout to close idle connections. Setting the default timeout below the keepalive timeout causes the worker to run in a undefined mode. I believe this may fix the issues in #880 but it was not clear if everyone having that issue were also calling setdefaulttimeout. In our particular case we are calling setdefaulttimeout which caused sockets to close before the keepalive expired. This causes a bunch of socket timeout error messages and potentially 504 errors with AWS ELBs.

@jkemp101 jkemp101 closed this Oct 6, 2017
@jkemp101 jkemp101 reopened this Oct 9, 2017
@tilgovi
Copy link
Collaborator

tilgovi commented Oct 10, 2017

Thank you for providing such detailed information and a patch.

The patch is simple and it sounds like it would work, but do you think that the assumption Gunicorn makes about the socket being blocking is a bad one? Should we discuss any alternative changes?

@jkemp101
Copy link
Contributor Author

I was also concerned about that but came to the conclusion its probably ok. This gets a bit confusing since we are talking about blocking sockets that in the gevent world aren't necessarily really blocking anything.

My thoughts:

  1. The default value for the timeout puts the connected socket in blocking mode. Therefore the majority of the gevent worker users are already running with blocking sockets. I would guess most people calling setdefaulttimeout don't realize the implications to the gevent worker.
  2. The gevent.Timeout object in timeout_ctx seems to handle "idle" sockets that are waiting to receive data. The keepalive timeout seems appropriate here to close connections. This was important to me because AWS CLBs do a lot of pre-opened connections that may never send a request. I wanted to make sure if the CLB didn't close these connections after a reasonable amount of time they wouldn't stack up in gunicorn. The keepalive timer kicked in and closed these connections even when no request had been set.
  3. I believe setting the default timeout also affects sending data to clients. With blocking sockets the send would theoretically block the greenlet until the underlying TCP timers detected the socket was broken and closed it. I don't think we would want to use the keepalive timeout for this anyway so a new send timeout would probably be needed with an associated refactoring of the send response code. This seems like a pretty unlikely edge case.

@tilgovi
Copy link
Collaborator

tilgovi commented Oct 11, 2017

Thanks for all that analysis. I agree with you. It does sound like it would fix #880, too. At least one comment there suggests that changing the socket timeout fixed the problem.

@benoitc benoitc merged commit 61431d4 into benoitc:master Nov 12, 2017
endreymarcell pushed a commit to prezi/gunicorn that referenced this pull request Jan 3, 2018
mjjbell pushed a commit to mjjbell/gunicorn that referenced this pull request Mar 16, 2018
…ed-socket

Force gevent connected socket to blocking mode
@jkemp101 jkemp101 deleted the fix-non-blocking-connected-socket branch March 16, 2018 15:42
@altaurog altaurog mentioned this pull request Mar 18, 2018
8 tasks
sokac pushed a commit to sokac/thrift that referenced this pull request Apr 27, 2018
Timeout is already set in HttpClient. The reason for removal is issue
with gunicorn and gevent as described here
benoitc/gunicorn#1616
sokac pushed a commit to sokac/thrift that referenced this pull request Apr 27, 2018
Timeout is already set in HttpClient. The reason for removal is issue
with gunicorn and gevent as described here
benoitc/gunicorn#1616
jeking3 pushed a commit to apache/thrift that referenced this pull request May 7, 2018
Timeout is already set in HttpClient. The reason for removal is issue
with gunicorn and gevent as described here
benoitc/gunicorn#1616
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants