-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
listdir() sometimes blocks program #117
Comments
The infinite loop is fixed but the underlying block is still present or does that PR fix this whole issue for you? I'm hoping to go through the PR queue and issues sometime soon so I can push a new release. It's been a hectic few months for me unfortunately so I haven't had the time to really work on this lately. |
Thanks @jborean93, for reply.
|
@jborean93 The issue just occurred again and the behavior was as expected:
|
This one sounds like it's going to be tricky to solve. Based on your description it sounds like the response for the Another route is to have a timeout while waiting for the response and have that raise an exception if the response wasn't received in time. This might be the way forward but I'm unsure what a sane value should be here for this. Should we just wait 30 seconds, a minute, something else. How would that apply to things that are expected to wait for ages like a change notification request or just a busy server unable to confirm the data was written to the underlying storage. |
Timeouts usually cause further problems, but with network communication involved it is unfortunately not possible without it. From a user perspective I would expect one of the two concepts:
Change notification requests are special and I do not have enough insides into the smb protocol to really comment on this. The Server has to response to a subscription creation, so a subscription could be treated as a normal request (with client side timeout). Later, notifications are send by the server, so a server side timeout should be used. Checking the healthiness of the connection is maybe partially implemented: smbprotocol/smbprotocol/connection.py Lines 1120 to 1127 in e4c8b6a
The echo command is used here to keep the connect alive, but the server sends also a response. If there is no response (again with timeout) the connection is already dead. If you use that and maybe also increase the echo frequency (current interval is 10 min), then in my case the blockage could be less than 16 min. An API-based timeout could also be built in here: Always if a request is added to or removed from self.outstanding_requests the next timeout time could be estimated and used for the self.transport.recv call. The implementation in detail also depends on whether we assume that it is possible that only individual requests go wrong and the connection itself remains intact or it is more more likely that the entire connection breaks off and everything is lost from then on.
|
The use of listdir() sometimes yields to a blocked program.
Expected Behavior
TimeoutError: [Errno 110] Connection timed out
should be handled.Current Behavior
The use case:
Sometimes the application hangs during the call of smbclient.listdir(). Until that point, the application usually runs for many hours, sometimes a few days.
During such failure state I retrieved debug information with faulthandler.dump_traceback(all_threads=True) (I removed the output for some Threads (DB connection and debugging)):
For 10 Minutes the Threads are just waiting. Then the timeout of 10min applies:
After that both Threads are waiting like before.
After about 16 minutes (from calling listdir()) an Exception occurs:
After this exception the smbprotocol module ends in an infinite loop of
OSError: [Errno 107] Transport endpoint is not connected
fromself._sock.shutdown(socket.SHUT_RDWR)
. This seems to be fixed with the newest PR (I just mention the last part of the issue for completeness and to endorse the release of the latest version).The text was updated successfully, but these errors were encountered: