Retry long-poll failures, with backoff #184

Closed
@gnprice

A Zulip client long-polls an event queue on the server to get events. In the current prototype, if that request fails for any reason, the long-poll loop just aborts:

  void poll() async {
    while (true) {
      final result = await getEvents(connection,
        queueId: queueId, lastEventId: lastEventId);
      // …

Instead, if the request fails with an error that we can expect to be transient, we should retry, with appropriate backoff.

Split off as #514: [We should also ensure that if the request doesn't come back within an appropriate timeout, we treat it as having failed and retry. The Zulip server should always respond to a long-poll within at most about 60 seconds, with a heartbeat event if there's no information to convey. So if it's been much longer than that and we haven't gotten a response, we should assume none is coming: the request or response must have been lost somewhere. (This part might get split out as a separate issue.)]
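
A minimal sketch of that timeout, using `Future.timeout` from the Dart core library. The 90-second value and the names `longPollTimeout` and `withLongPollTimeout` are assumptions for illustration; the only figure taken from the text above is that the server should respond within about 60 seconds.

```dart
import 'dart:async';

/// Assumed value: somewhat longer than the ~60 seconds within which the
/// server is expected to respond (with a heartbeat if nothing else).
const longPollTimeout = Duration(seconds: 90);

/// Runs one long-poll request and fails with a [TimeoutException] if it
/// has not completed within [timeout].
///
/// `request` is a stand-in for a call such as
/// `getEvents(connection, queueId: queueId, lastEventId: lastEventId)`.
Future<T> withLongPollTimeout<T>(
  Future<T> Function() request, {
  Duration timeout = longPollTimeout,
}) {
  // With no `onTimeout` callback, the returned future completes with a
  // TimeoutException once `timeout` has elapsed.
  return request().timeout(timeout);
}
```

Note that `Future.timeout` only stops waiting; it does not cancel the underlying HTTP request, so a real implementation may also want to close or abandon the connection.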

For which types of errors we can expect to be transient and should therefore retry, see zulip-mobile at tryFetch:

            if (!(shouldRetry && (e instanceof Server5xxError || e instanceof NetworkError))) {
              throw e;
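
A rough Dart counterpart of that condition might look like the following. `HttpStatusException`, `isTransient`, and `retryTransient` are hypothetical names, not the client's actual API, and treating `SocketException` and `TimeoutException` as the "network error" case is an assumption.

```dart
import 'dart:async';
import 'dart:io';

/// Hypothetical exception for an HTTP error response; the real client's
/// exception types may differ.
class HttpStatusException implements Exception {
  HttpStatusException(this.statusCode);
  final int statusCode;
}

/// Whether [error] looks transient: a 5xx response or a network-level
/// failure, mirroring the zulip-mobile condition quoted above.
bool isTransient(Object error) {
  if (error is HttpStatusException) {
    return error.statusCode >= 500 && error.statusCode <= 599;
  }
  return error is SocketException || error is TimeoutException;
}

/// Retries [request] for as long as it fails with a transient error,
/// calling [wait] between attempts; any other error propagates.
Future<T> retryTransient<T>(
  Future<T> Function() request, {
  required Future<void> Function() wait,
}) async {
  while (true) {
    try {
      return await request();
    } catch (e) {
      if (!isTransient(e)) rethrow;
      await wait();
    }
  }
}
```

Passing the delay in as `wait` keeps the retry loop independent of whatever backoff policy is chosen.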

For how to control backoff, see zulip-mobile at BackoffMachine, the background discussion at zulip/zulip-mobile#3841 (comment), and the use of BackoffMachine in tryFetch. I think we can basically transcribe BackoffMachine from JS to Dart and reuse that.
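
As a starting point, a Dart backoff helper in that spirit could look roughly like this: an exponentially growing bound with a ceiling, and a random wait below the bound. This is a sketch of the general technique, not a verified transcription of the JS class; the default values (100 ms, 10 s, base 2) and the member names are assumptions and should be checked against the zulip-mobile source.

```dart
import 'dart:math';

/// Sketch of a backoff helper: exponential growth, a ceiling, and jitter.
class BackoffMachine {
  BackoffMachine({
    this.firstBound = const Duration(milliseconds: 100), // assumed default
    this.maxBound = const Duration(seconds: 10), // assumed default
    this.base = 2.0, // assumed default
    Random? random,
  }) : _random = random ?? Random();

  final Duration firstBound;
  final Duration maxBound;
  final double base;
  final Random _random;
  int _waitsCompleted = 0;

  /// Upper bound on the wait after [waitsCompleted] earlier waits:
  /// firstBound * base^waitsCompleted, capped at [maxBound].
  Duration boundAfter(int waitsCompleted) {
    final ms = firstBound.inMilliseconds * pow(base, waitsCompleted);
    return Duration(
        milliseconds: min(ms, maxBound.inMilliseconds).round());
  }

  /// Waits a random duration between zero and the current bound, then
  /// raises the bound for the next call.
  Future<void> wait() async {
    final bound = boundAfter(_waitsCompleted);
    final delay = Duration(
        microseconds: (_random.nextDouble() * bound.inMicroseconds).round());
    await Future<void>.delayed(delay);
    _waitsCompleted++;
  }
}
```

In the poll loop, one would create a fresh `BackoffMachine` after each successful request and call `await backoff.wait()` after each transient failure, so that the delay resets once the connection recovers.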

Labels

a-api: Implementing specific parts of the Zulip server API
a-sync: Event queue; retry; local echo; races
beta feedback: Things beta users have specifically asked for
