What happened + What you expected to happen
Sometimes, a deployment that uses the @serve.batch API enters a permanently stuck state while serving requests that are cancelled by the client. Once stuck, new requests never execute; they hang until a client-side timeout fires.
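The failure mode can be illustrated with a toy queue-and-worker batcher. This is a simplified sketch of how client cancellation can interact with a batching loop; it is NOT Ray Serve's actual @serve.batch implementation, and all names in it are hypothetical:

```python
import asyncio


class NaiveBatcher:
    """Toy batcher: callers enqueue items, one worker fulfills them.

    Simplified sketch only -- not Ray Serve's implementation.
    """

    def __init__(self):
        self.queue = asyncio.Queue()
        self.worker = None

    async def submit(self, item):
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((item, fut))
        if self.worker is None:
            self.worker = asyncio.create_task(self._run())
        # If the caller is cancelled while awaiting here, `fut` is cancelled,
        # but the item has already been handed to the worker.
        return await fut

    async def _run(self):
        while True:
            item, fut = await self.queue.get()
            await asyncio.sleep(0.01)  # stand-in for model latency
            # Without this guard, set_result() on a cancelled future raises
            # InvalidStateError, the worker task dies, and every later
            # submit() hangs forever -- the "permanently stuck" symptom.
            if not fut.cancelled():
                fut.set_result(item * 2)


async def demo():
    batcher = NaiveBatcher()
    first = asyncio.create_task(batcher.submit(1))
    await asyncio.sleep(0)  # let the first submit enqueue and block
    first.cancel()          # simulate a client-side cancellation
    # A later request should still complete despite the cancellation.
    return await asyncio.wait_for(batcher.submit(21), timeout=1)


if __name__ == "__main__":
    print(asyncio.run(demo()))  # 42
```

The point of the sketch is the guard in `_run`: a batching loop must tolerate futures that were cancelled while it was processing the batch, or a single cancelled request can kill the loop and strand all subsequent requests.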
Versions / Dependencies
2.41
Reproduction script
import asyncio
from typing import List

import ray
from ray import serve
from starlette.requests import Request


@serve.deployment(max_ongoing_requests=10)
class MyDeployment:
    @serve.batch(max_batch_size=8, batch_wait_timeout_s=0.001)
    async def __call__(self, http_request: List[Request]):
        model_input = [await req.json() for req in http_request]
        # Sleep for the duration requested by the first request in the batch.
        await asyncio.sleep(float(model_input[0]["time"]))


entrypoint = MyDeployment.bind()
Sending a stream of traffic to this deployment can cause the deployment to enter a stuck state.
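One way to generate such traffic is a simple client that fires concurrent requests with an aggressively short timeout, so many are abandoned while still in flight. A minimal sketch; the URL, payload shape, worker count, and timeout values here are assumptions, not taken from the original report:

```python
import concurrent.futures
import json
import urllib.request

# Assumed local Serve HTTP address; adjust to your setup.
URL = "http://localhost:8000/"


def send_one(delay: str) -> str:
    req = urllib.request.Request(
        URL,
        data=json.dumps({"time": delay}).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        # Very short client timeout: many requests are abandoned while still
        # in flight, which the server observes as client-side cancellations.
        with urllib.request.urlopen(req, timeout=0.05):
            return "ok"
    except OSError:  # URLError, timeouts, and connection errors
        return "dropped"


def send_stream(n: int = 200) -> list:
    # Fire requests concurrently from a thread pool to keep batches full.
    with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
        return list(pool.map(send_one, ["0.5"] * n))


if __name__ == "__main__":
    results = send_stream()
    print(results.count("ok"), "ok /", results.count("dropped"), "dropped")
```

Because the requested sleep (0.5 s) exceeds the client timeout (0.05 s), most requests are dropped mid-flight, producing the steady stream of cancellations that triggers the stuck state.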
Issue Severity
None
zcin added the bug (Something that is supposed to be working; but isn't), P0 (Issues that should be fixed in short order), and serve (Ray Serve Related Issue) labels on Jan 23, 2025.