Replication reference with node_confirms #1857

Open
martinsumner opened this issue Apr 19, 2023 · 1 comment

Comments

@martinsumner

When replicating objects using nextgenrepl, the sink cluster issues fetch requests to the source cluster. Each fetch request reads from the real-time queue any item ready for replication. The item will be one of:

  • An actual object which has recently been PUT;
  • A reference to an object which exists in the store;
  • A delete reference.

The second case will commonly occur when a repl_keys_range aae_fold has been made (but also when the real-time queue has grown during a busy period).

In the case of an object reference being read from the queue, a standard GET request will be used to return the actual object to the sink.
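The three cases above can be sketched as a dispatch on the item read from the queue. This is a minimal illustration only: the module name, the tuple shapes and the `GetFun` callback are hypothetical and are not the actual riak_kv queue format or API.

```erlang
-module(repl_item_sketch).
-export([resolve/2]).

%% Hypothetical item shapes; the real queue entries in riak_kv may differ.
-type item() :: {object, term()}
              | {reference, binary(), binary(), term()}
              | {delete_reference, binary(), binary(), term()}.

-type get_fun() :: fun((binary(), binary(), term()) ->
                          {ok, term()} | {error, term()}).

%% GetFun is a caller-supplied fun standing in for the standard GET
%% that is used to turn an object reference into the actual object.
-spec resolve(item(), get_fun()) ->
          {replicate, term()}
        | {delete, binary(), binary(), term()}
        | {error, term()}.
resolve({object, Obj}, _GetFun) ->
    %% The object itself was queued: nothing further to read.
    {replicate, Obj};
resolve({reference, Bucket, Key, Clock}, GetFun) ->
    %% Only a reference was queued: read the object from the store.
    case GetFun(Bucket, Key, Clock) of
        {ok, Obj} -> {replicate, Obj};
        {error, Reason} -> {error, Reason}
    end;
resolve({delete_reference, Bucket, Key, Clock}, _GetFun) ->
    %% A delete reference carries no object to read.
    {delete, Bucket, Key, Clock}.
```

Only the second clause touches the store, which is why the GET options discussed below matter for reference-heavy queues.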

That GET request will have some specific options to improve performance:

Options = [deletedvclock, {pr, 1}, {r, 1}, {notfound_ok, false}],

These options allow the object to be returned to the client as soon as an object matching the expected vector clock has been read. As this might happen after only 1 read, the R/PR settings override any bucket property requiring a higher R value.

However, if the bucket has node_confirms set to a value greater than 1, the response will fail validation at this stage.
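A simplified model of the conflict, for illustration only. The request options replace the bucket's r and pr, but node_confirms is not among them, so the bucket value still applies. The module and function names are hypothetical, and the rule that validation rejects node_confirms greater than R is an assumption made to mirror the behaviour described here, not a copy of the riak_kv validation code.

```erlang
-module(repl_get_opts_sketch).
-export([effective/2, node_confirms_ok/1]).

%% Request options take precedence over bucket properties.
%% node_confirms is absent from the request options, so the
%% bucket property is the value that remains in effect.
-spec effective(list(), list()) -> [{atom(), term()}].
effective(ReqOpts, BucketProps) ->
    [{K, proplists:get_value(K, ReqOpts, proplists:get_value(K, BucketProps))}
     || K <- [r, pr, node_confirms]].

%% ASSUMPTION: validation fails when node_confirms exceeds R.
%% Expects integer values for both properties.
-spec node_confirms_ok([{atom(), term()}]) -> boolean().
node_confirms_ok(Effective) ->
    NodeConfirms = proplists:get_value(node_confirms, Effective),
    R = proplists:get_value(r, Effective),
    NodeConfirms =< R.
```

With the request options shown above and a bucket that sets `{node_confirms, 2}`, `effective/2` yields an r of 1 alongside a node_confirms of 2, and the check returns `false` under the stated assumption.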

@martinsumner

It is possible to create unhandled pressure, especially when using repl range folds. The standard mechanism for tuning this is manipulating the number of snk_workers fetching and pushing.

An alternative would be to allow the r value on fetch, and the w value on push, to be overridden. This will slow replication, but by allowing more than one vnode to confirm before completing the operation there will be a natural brake on snk_workers when vnodes start developing backlogs.
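One possible shape for such an override, as a sketch only. The module and function names are hypothetical, the idea that the values arrive as arguments (for example from a new configuration setting) is an assumption, and the push option list is illustrative since the options currently used on push are not shown in this issue.

```erlang
-module(repl_override_sketch).
-export([fetch_options/1, push_options/1]).

%% Fetch options as used today, but with r supplied by the caller
%% instead of being fixed at 1. Hypothetical: RVal would come from
%% a setting that does not exist yet.
-spec fetch_options(pos_integer()) -> list().
fetch_options(RVal) when is_integer(RVal), RVal >= 1 ->
    [deletedvclock, {pr, 1}, {r, RVal}, {notfound_ok, false}].

%% Illustrative push options carrying only an overridable w value.
-spec push_options(pos_integer()) -> list().
push_options(WVal) when is_integer(WVal), WVal >= 1 ->
    [{w, WVal}].
```

Calling `fetch_options(1)` reproduces the current option list, so the existing behaviour would remain the default and a higher value would be opt-in.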
