If the Python code takes too long to run between the POST /v1/exec to start a Pebble exec, and connecting to all (2 or 3, depending on combine_stderr) of the websockets, the Python code will hang indefinitely and never time out. This is because there's no socket timeout set during the websocket connect phase, and Pebble waits rather than rejecting the connection (even though it's already exceeded its waitIOConnected timeout).
You can test this by adding time.sleep(5.1) between the POST and the websocket connections, here:
diff --git a/ops/pebble.py b/ops/pebble.py
index 831c778..ad2fb3e 100644
--- a/ops/pebble.py
+++ b/ops/pebble.py
@@ -2751,6 +2751,7 @@ class Client:
         stderr_ws: Optional[_WebSocket] = None
         try:
+            time.sleep(5.1)
             control_ws = self._connect_websocket(task_id, 'control')
             stdio_ws = self._connect_websocket(task_id, 'stdio')
             if not combine_stderr:
Then fire up pebble run in one terminal, and run a Pebble exec in another:
$ .tox/unit/bin/python -m test.pebble_cli exec -- echo foo
# after 5s, the Pebble logs will show "timeout waiting for websocket connections",
# but it will hang here
We should almost certainly have a (relatively short) timeout on the socket during connect, though we have to unset the timeout after it's connected, as the websockets for control and stdio are essentially long-polling and will wait an arbitrary amount of time until input arrives. With this fix, we get this (after 10s = 5s Pebble timeout + 5s connect timeout):
$ .tox/unit/bin/python -m test.pebble_cli --socket=/var/lib/pebble/default/.pebble.socket exec -- echo foo
ChangeError: cannot perform the following tasks:
- Execute command "echo" (exec 31: timeout waiting for websocket connections: context deadline exceeded)
----- Logs from task 0 -----
2024-06-05T16:08:18+12:00 ERROR exec 31: timeout waiting for websocket connections: context deadline exceeded
-----
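To illustrate the idea of a connect-phase timeout that is removed once the connection is established, here is a minimal standalone sketch using only the standard library `socket` module. It is not the actual Ops change: `connect_with_timeout`, the `handshake` callable, and `demo` are hypothetical names, and a plain TCP listener that never replies stands in for Pebble after its waitIOConnected timeout has elapsed.

```python
import socket


def connect_with_timeout(address, handshake, connect_timeout=5.0):
    """Connect and run a handshake under a timeout, then return a blocking socket.

    `handshake` is a hypothetical callable standing in for the websocket
    upgrade: it sends a request on the socket and reads the reply.
    """
    sock = socket.create_connection(address, timeout=connect_timeout)
    try:
        # Raises socket.timeout if the peer accepts but never replies.
        handshake(sock)
    except BaseException:
        sock.close()
        raise
    # Connected: drop the timeout, because later reads are effectively
    # long-polling and may legitimately block for an arbitrary time.
    sock.settimeout(None)
    return sock


def demo():
    # A listener that lets connections complete (via the listen backlog)
    # but never accepts or answers them, standing in for a silent peer.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(('127.0.0.1', 0))
    server.listen(1)

    def handshake(sock):
        sock.sendall(b'hello')
        sock.recv(1)  # no reply ever arrives, so this waits for the timeout

    try:
        connect_with_timeout(server.getsockname(), handshake, connect_timeout=0.2)
    except socket.timeout:
        return 'timed out'
    finally:
        server.close()
    return 'connected'
```

Without the timeout, the `recv` call in `handshake` would block forever, which is the hang described above. With it, the caller gets an exception it can report or retry on.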
We can probably also improve Pebble's handling of this, as it should know that the waitIOConnected bit has already timed out, but fixing it in Ops will be a great start! I'll push up a PR soon.
…cal#1247)
This is the fix for the issue described at
canonical#1246. Essentially, if the
Pebble timeout has already elapsed, Pebble will happily wait
indefinitely for the connect to go through, and the Python side will
hang. Add a timeout during the connect phase to cut this short.
Fixes canonical#1246.
---------
Co-authored-by: Tony Meyer <tony.meyer@gmail.com>