support snapshot option 'invalidate_hard_deletes' #20

Merged
merged 2 commits into from
Jun 27, 2022

@willi-mueller willi-mueller commented Mar 30, 2022

This PR:

  1. adds the option invalidate_hard_deletes, which marks records that are missing from the source as deleted in the snapshot
  2. adapts snapshot_merge.sql so that it follows the same structure as the corresponding file in dbt-core
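
To illustrate item 1, here is a minimal sketch of how rows missing from the source could be detected and invalidated. It uses Python's built-in sqlite3 instead of Exasol, and the table names (source, snapshot), the simplified valid_from/valid_to columns, and the helper function are assumptions made for this example only. The actual macros in this PR and in dbt-core are more involved.

```python
import sqlite3


def invalidate_hard_deletes(conn, now):
    """Close every currently valid snapshot row whose id is no longer in the source.

    Hypothetical helper for illustration; not the macro shipped in this PR.
    """
    conn.execute(
        """
        UPDATE snapshot
           SET valid_to = ?
         WHERE valid_to IS NULL
           AND id NOT IN (SELECT id FROM source)
        """,
        (now,),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute(
    "CREATE TABLE snapshot (id INTEGER, val TEXT, valid_from TEXT, valid_to TEXT)"
)

# The snapshot knows ids 1 and 2, but id 1 has been hard-deleted from the source.
conn.executemany("INSERT INTO source VALUES (?, ?)", [(2, "v1")])
conn.executemany(
    "INSERT INTO snapshot VALUES (?, ?, ?, ?)",
    [(1, "v1", "t0", None), (2, "v1", "t1", None)],
)

invalidate_hard_deletes(conn, "t2")
rows = conn.execute("SELECT id, valid_to FROM snapshot ORDER BY id").fetchall()
print(rows)
```

Only the row for id = 1 receives a valid_to value; the row for id = 2 stays open because it still exists in the source.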

Tests

I tested snapshots manually both with and without the new configuration option invalidate_hard_deletes enabled.

{{
    config(
      unique_key='id',
      target_schema='test_schema',
      strategy='check',
      invalidate_hard_deletes=True,
      check_cols=[ 'val' ]
    )
}}

Step 0: One row (id = 1) is present in the source
Screenshot 2022-05-31 at 3 09 38 PM

Step 1: A second row (id = 2) is present in the source
Screenshot 2022-05-31 at 3 11 03 PM

Step 2: The first row (id = 1) is deleted in the source. Thus, valid_to is set by the dbt snapshot run
Screenshot 2022-05-31 at 5 38 56 PM

Step 3: The first row (id = 1) appears again in the source. Thus, it is added to the snapshot as a valid row
Screenshot 2022-05-31 at 5 39 56 PM

Step 4: The second row (id = 2) changes its value to val = 'v2'. Thus, the old record is closed by setting its valid_to, and a new row for the current value is appended to the snapshot.
Screenshot 2022-05-31 at 5 40 54 PM

Thus it has been demonstrated that this PR satisfies the following cases:

  1. The existing functionality of tracking changes of rows over time has not been changed
  2. Hard deletes in the source are tracked as well if the option is enabled. This means that if a row does not appear in the source data, it will be marked as deleted in the snapshot
  3. Once deleted, rows can re-appear in the source and their re-appearance will be tracked in the snapshot as well.
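
The manual test protocol above can be approximated with a small in-memory simulation. This is only a sketch of the expected behaviour under simplifying assumptions: the snapshot_run function, the integer timestamps, and the initial value 'v1' for both rows are invented for illustration, and no Exasol or dbt code is executed.

```python
def snapshot_run(snapshot, source, now, check_cols=("val",), invalidate_hard_deletes=True):
    """Apply one simplified 'check' strategy run to an in-memory snapshot.

    snapshot: list of dicts with keys id, <check_cols>, valid_from, valid_to
    source:   dict mapping id -> dict of column values
    """
    # Currently valid snapshot rows, keyed by the unique key.
    current = {r["id"]: r for r in snapshot if r["valid_to"] is None}
    for key, row in current.items():
        src = source.get(key)
        if src is None:
            # Hard delete: the row vanished from the source.
            if invalidate_hard_deletes:
                row["valid_to"] = now
        elif any(src[c] != row[c] for c in check_cols):
            # A checked column changed: close the old version.
            row["valid_to"] = now
    # Insert a new version for every source row without an open snapshot row.
    still_valid = {r["id"] for r in snapshot if r["valid_to"] is None}
    for key, src in source.items():
        if key not in still_valid:
            new_row = {"id": key, "valid_from": now, "valid_to": None}
            new_row.update({c: src[c] for c in check_cols})
            snapshot.append(new_row)
    return snapshot


snap = []
steps = [
    {1: {"val": "v1"}},                    # Step 0: only id = 1
    {1: {"val": "v1"}, 2: {"val": "v1"}},  # Step 1: id = 2 appears
    {2: {"val": "v1"}},                    # Step 2: id = 1 is deleted
    {1: {"val": "v1"}, 2: {"val": "v1"}},  # Step 3: id = 1 re-appears
    {1: {"val": "v1"}, 2: {"val": "v2"}},  # Step 4: id = 2 changes value
]
for now, source in enumerate(steps):
    snapshot_run(snap, source, now)

history = [(r["id"], r["val"], r["valid_from"], r["valid_to"]) for r in snap]
for row in history:
    print(row)
```

After the five runs, the snapshot holds two closed rows (the deleted id = 1 and the old value of id = 2) and two open rows (the re-appeared id = 1 and the new value of id = 2), matching the three cases listed above.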

Background

This PR aims to follow the code from dbt-core as seen here and here

@willi-mueller willi-mueller reopened this Mar 30, 2022
@willi-mueller willi-mueller marked this pull request as draft March 30, 2022 12:28
@willi-mueller
Contributor Author

The code for this PR is complete.

I just want to add more information about my test protocol so that reviewers can reproduce it. Apart from that, the PR is done.

@willi-mueller willi-mueller changed the title from "ports snapshot option 'invalidate_hard_deletes' to Exasol" to "support snapshot option 'invalidate_hard_deletes'" Mar 30, 2022
@willi-mueller willi-mueller force-pushed the support-snapshot-hard-delete branch from 4ba5ae0 to b984289 Compare April 4, 2022 09:35
@willi-mueller willi-mueller marked this pull request as ready for review May 31, 2022 12:21
@willi-mueller willi-mueller force-pushed the support-snapshot-hard-delete branch from b984289 to 48b9bf0 Compare June 1, 2022 10:18
@tglunde
Owner

tglunde commented Jun 27, 2022

all good

@tglunde tglunde merged commit 78e4be5 into tglunde:master Jun 27, 2022