Set unique table suffix to allow parallel incremental execution #810

huangxingyi-git · 2024-09-26T13:50:52Z

Describe the feature

For some specific cases (e.g. backfilling a very large amount of data), we need to execute multiple dbt runs of a specific incremental (replace_where) model in parallel, passing the date (or country) as a var argument.
For example, we have a model that we run every day using Airflow, to which we pass a date relative to the Airflow scheduler.
FYI
https://github.com/dbt-labs/dbt-athena/pull/650/files

If we want to process batches of N days in parallel using Airflow concurrency, the tmp table created by each dbt run needs to be unique. Otherwise, you end up with N inserts attempting to run with the same __dbt_tmp name, which causes conflicts and ultimately failures.
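To illustrate the collision, here is a minimal Python sketch, not the adapter's actual implementation. The helper `unique_tmp_name`, the model name `events`, and the choice of a UUID as the suffix are all assumptions made for illustration; only the `__dbt_tmp` name comes from the description above.

```python
import uuid


def unique_tmp_name(model_name: str, base_suffix: str = "__dbt_tmp") -> str:
    """Return a temp relation name that differs on every call.

    Hypothetical helper: appends a random UUID (hex digits only, so it is
    safe inside an identifier) to the usual temp suffix.
    """
    return f"{model_name}{base_suffix}_{uuid.uuid4().hex}"


# Three concurrent runs with a fixed suffix all target the same relation.
shared = ["events" + "__dbt_tmp" for _ in range(3)]

# With a per-run suffix, each run gets its own temp relation.
unique = [unique_tmp_name("events") for _ in range(3)]

print(len(set(shared)))  # 1 distinct name, so the runs conflict
print(len(set(unique)))  # 3 distinct names, so the runs do not conflict
```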

Who will this benefit?

For those who use replace_where as their incremental strategy.
Example use case: run the same incremental model concurrently with different --vars in order to insert multiple data partitions in parallel.
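The use case could be sketched roughly as follows. This uses a plain Python thread pool in place of Airflow concurrency, and the var name `run_date` and the model name `my_incremental_model` are hypothetical; `dbt run`, `--select` and `--vars` are the standard dbt CLI options.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta


def build_dbt_command(model: str, run_date: date) -> list[str]:
    """Build one `dbt run` invocation for a single date partition."""
    # --vars takes a YAML string; a JSON object is valid YAML.
    vars_json = json.dumps({"run_date": run_date.isoformat()})  # hypothetical var name
    return ["dbt", "run", "--select", model, "--vars", vars_json]


def backfill(model: str, start: date, days: int, max_workers: int = 4,
             runner=subprocess.run):
    """Run the same model once per day, several days at a time."""
    commands = [build_dbt_command(model, start + timedelta(days=i))
                for i in range(days)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each command is an independent dbt process; without unique temp
        # table names these processes would collide on the same __dbt_tmp.
        return list(pool.map(runner, commands))
```

Calling `backfill("my_incremental_model", date(2024, 9, 1), days=7)` would start seven dbt processes, at most four at a time.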

Are you interested in contributing this feature?

I am interested in contributing to this feature if needed.


benc-db · 2024-09-27T17:13:19Z

This issue is going to be solved more comprehensively with dbt-labs/dbt-core#10672; however, I'll take a look at your PR, and assuming it doesn't hurt any other use case, I'm not against taking this in the short term.

huangxingyi-git added the enhancement New feature or request label Sep 26, 2024

huangxingyi-git mentioned this issue Sep 26, 2024

Set unique temp table suffix to allow parallel incremental executions #811

Merged

3 tasks

benc-db closed this as completed Oct 11, 2024
