Redirect client to better server #274

oliverhausler · 2020-10-11T16:16:40Z

The NATS protocol lets clients connect to an arbitrary server, which is simple and generic, but often there is a better server (closer geolocation, the same server where a channel is currently published, etc.).

A lot of traffic could be offloaded from one's own cluster if there were a mechanism that tells a client to connect to another, more suitable server for a certain subscription.

LaPetiteSouris · 2020-10-11T18:36:45Z

I think it is somehow related to this feature

If you want to subscribe for read-only purposes, you can try using ReadISRReplica for now. This may not be precisely what you want, but it can help fan out messages and reduce the workload on the leader.

oliverhausler · 2020-10-11T23:36:59Z

@LaPetiteSouris Yes, I was close to adding it to #219, but it is not exactly the same, and a server in a close geo-location is not necessarily the best server. It depends on the use case, I guess.

If you think of chat rooms, geo-proximity or a geo-cluster is most probably the way to go. But it can also be different: think of something like YouTube (I know it's different, I'm only looking at the load distribution here), where videos are stored on a certain server. When a user subscribes to watch a video, the best server is probably the one that has the video stored locally, not the one closest to the user. Only when the same video is served to many viewers simultaneously may geo-proximity win (and probably not for the data provider, only when looking at the network load).

Generally speaking, this is something many people get wrong. Everybody talks about edge, but having a client contact an edge server is only useful when the data is stored at the edge. As soon as the edge server must pull the data over a long distance (worst of all when a data request needs several round trips), having the client connect to a server close to the data often wins, and even a local cluster in a single geolocation probably wins.

tylertreat · 2020-10-12T17:37:10Z

I think this would be a great feature. Would need to think through how to implement properly, especially since it depends heavily on the use case as you mention.

> If you want to subscribe for read-only purpose, for now you can try using ReadISRReplica.

My thought was to eventually extend ReadISRReplica to make it smarter, e.g. by subscribing from the closest geo replica. This feature might play into how that would work.

oliverhausler · 2020-10-12T20:52:44Z

It could even be a simple client feature, where the server sends a "redirect recommendation" with a list of better servers. It would then be up to the client whether to disconnect or not.
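A minimal sketch of how a client might act on such a recommendation. Everything here is hypothetical: `RedirectRecommendation`, `ChooseServer`, the `allowed` policy callback, and the example addresses are illustrative names, not part of any existing Liftbridge or NATS API. The point is only that the server suggests and the client decides.

```go
package main

import "fmt"

// RedirectRecommendation is a hypothetical message a server could send to
// suggest better servers. It is not part of any existing Liftbridge API.
type RedirectRecommendation struct {
	Stream  string   // stream the recommendation applies to
	Servers []string // candidate addresses, best first
}

// ChooseServer returns the address the client should use next. The client
// stays on current unless the recommendation lists a server, ranked ahead of
// current, that the client's own policy (allowed) accepts.
func ChooseServer(current string, rec RedirectRecommendation, allowed func(addr string) bool) string {
	for _, addr := range rec.Servers {
		if addr == current {
			return current // already on a recommended server
		}
		if allowed(addr) {
			return addr
		}
	}
	return current // nothing acceptable: ignore the recommendation
}

func main() {
	rec := RedirectRecommendation{
		Stream:  "chat.room1",
		Servers: []string{"eu-1.example.com:9292", "eu-2.example.com:9292"},
	}
	// Assumed client policy: eu-1 is not acceptable for this client.
	allowed := func(addr string) bool { return addr != "eu-1.example.com:9292" }
	fmt.Println(ChooseServer("us-1.example.com:9292", rec, allowed))
}
```

Keeping the decision in the client means a client that ignores the message keeps working unchanged, which matches the "recommendation" wording above.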

tylertreat · 2020-10-12T21:22:35Z

It might make sense to piggyback this information on the FetchMetadata endpoint.

LaPetiteSouris · 2020-10-15T18:29:57Z

> It might make sense to piggyback this information on the FetchMetadata endpoint.

In that case, the server would need geolocation awareness.

Otherwise, we may end up with a simple load-balancing module. In fact, I think it is hard to define a strict, limited rule set for deciding the "best servers". IMO, the rules for scoring a server in the cluster may differ greatly between scenarios. Thus, a server judged "best" or "next best" for one use case may not even be close to "good" for another.

Agreed that the current way of choosing a server to subscribe to has lots of room for improvement, but I think it would be nice to first come up with something generic and simple enough to implement as a first step.

Some rules for scoring a server in the new load-balancing module might be:

Round robin: go to the next server in the list.
Least response time.
Least connections (this may be suitable for FetchMetadata).
Geolocation distance.

With that in mind, I do not know whether it is actually Metadata or AggregatedMetrics we should focus on. If it is the latter, then issue #222 is related.

tylertreat · 2020-10-15T20:12:32Z

@LaPetiteSouris Agreed. Also, I would like to revisit how the client does connection management to leverage gRPC's components like Resolver and Picker. These are more extensible for allowing different implementations for how connections are selected/balanced (see liftbridge-io/go-liftbridge#89).

oliverhausler · 2020-10-15T20:35:44Z

@LaPetiteSouris Agreed. Two more algorithms:

Closest to publisher
Cheapest (servers have lower or higher data cost, based on location and provider)
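As a rough sketch, these two criteria could be expressed as simple comparisons over per-server attributes. The `Candidate` type, the hop count used as a stand-in for "closeness to the publisher", and the per-GB cost figure are all assumptions made for illustration; how such values would be measured or exposed is not settled in this thread.

```go
package main

import "fmt"

// Candidate carries hypothetical attributes for the two criteria above.
type Candidate struct {
	Addr            string
	HopsToPublisher int     // assumed measure of closeness to the publishing server
	CostPerGB       float64 // assumed data cost for this location and provider
}

// pick returns the candidate that is "less" than all others; on a tie the
// earlier candidate wins. ok is false if cs is empty.
func pick(cs []Candidate, less func(a, b Candidate) bool) (Candidate, bool) {
	if len(cs) == 0 {
		return Candidate{}, false
	}
	best := cs[0]
	for _, c := range cs[1:] {
		if less(c, best) {
			best = c
		}
	}
	return best, true
}

// ClosestToPublisher prefers the server with the fewest hops to the publisher.
func ClosestToPublisher(cs []Candidate) (Candidate, bool) {
	return pick(cs, func(a, b Candidate) bool { return a.HopsToPublisher < b.HopsToPublisher })
}

// Cheapest prefers the lowest data cost and breaks ties by closeness.
func Cheapest(cs []Candidate) (Candidate, bool) {
	return pick(cs, func(a, b Candidate) bool {
		if a.CostPerGB != b.CostPerGB {
			return a.CostPerGB < b.CostPerGB
		}
		return a.HopsToPublisher < b.HopsToPublisher
	})
}

func main() {
	cs := []Candidate{
		{Addr: "a:9292", HopsToPublisher: 0, CostPerGB: 0.09},
		{Addr: "b:9292", HopsToPublisher: 2, CostPerGB: 0.02},
	}
	cheap, _ := Cheapest(cs)
	close, _ := ClosestToPublisher(cs)
	fmt.Println(cheap.Addr, close.Addr) // prints "b:9292 a:9292"
}
```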

LaPetiteSouris · 2020-10-18T13:10:22Z

> @LaPetiteSouris Agreed. Also, I would like to revisit how the client does connection management to leverage gRPC's components like Resolver and Picker. These are more extensible for allowing different implementations for how connections are selected/balanced (see liftbridge-io/go-liftbridge#89).

A prerequisite for that issue would be for the server to expose the required information first, e.g. geolocation, number of connections, etc.

LaPetiteSouris · 2021-12-07T23:19:52Z

I suggest we close this one. Although liftbridge-io/go-liftbridge#114 is a bit rudimentary, it already provides this feature. Let's open another issue if needed in the future.

tylertreat added the enhancement label Oct 12, 2020

tylertreat closed this as completed Dec 7, 2021
