Skip to content

Detach sinks and backpressure from topic-partitions #786

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 8 commits into from
Mar 13, 2025

Conversation

daniil-quix
Copy link
Collaborator

@daniil-quix daniil-quix commented Mar 12, 2025

Motivation

Currently, Sinks are tightly coupled with topic partitions, which doesn't work well with the upcoming joins and merges.

This PR removes the dependency on TPs from certain Sinks methods.
BatchingSink classes keep batching data per TP to reduce the amount of the changed code, but BatchingSink.flush() will flush all the batches.

Main changes

  1. Sinks are now flushed first on each checkpoint before producing changelogs.
    This is done to minimize potential over-production in case of Sink's failure.
  2. Updated Sink.flush() - it is now expected to flush all the accumulated data for all TPs.
  3. Updated Sink backpressure handling - SinkBackpressureError now pauses the whole assignment instead of certain partitions only.
  4. Updated existing Sinks to follow the new API

Sinks may fail more often because they are external.
Failing early allows to skip producing changelogs and
restart faster.
- Sink.flush() now flushes all the accumulated batches
- Backpressure pauses the whole assignment instead of a single partition
Copy link
Contributor

@gwaramadze gwaramadze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how this simplifies the code overall.

daniil-quix and others added 5 commits March 13, 2025 11:02
Co-authored-by: Remy Gwaramadze <gwaramadze@users.noreply.github.com>
Co-authored-by: Remy Gwaramadze <gwaramadze@users.noreply.github.com>
Co-authored-by: Remy Gwaramadze <gwaramadze@users.noreply.github.com>
Co-authored-by: Remy Gwaramadze <gwaramadze@users.noreply.github.com>
- Add helper to get non-changelog TPs
- Reset PausingManager on init
@daniil-quix daniil-quix merged commit 811a9d4 into main Mar 13, 2025
3 checks passed
@daniil-quix daniil-quix deleted the feature/sinks-refactoring-for-merge branch March 13, 2025 12:01
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants