-
Notifications
You must be signed in to change notification settings - Fork 206
Feat!: Add support for concurrent table diff of multiple models #4256
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Conversation
f9124c6
to
febb8fb
Compare
0270149
to
a056bcf
Compare
Thanks for the review @izeigerman addressed comments, included messages indicating which models will be diffed and which will not and added a progress bar as well as information about which models are currently being processed for table differences, before providing the full table diffs at the end. I added a video in the description of this pr of how this look |
a056bcf
to
a01c049
Compare
a01c049
to
bd71a82
Compare
yes really good points @sungchun12 , changed to provide the file path and model names that don't have grain and also to prevent the diffs from running in that case. so it currently looks like this: ![]() |
Before you merge, I want to review the UX and hammer away at likely usage patterns. Great update so far! After, I'll approve. |
bd71a82
to
b333d0f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add to the table diff guide? Something like, "Diffing multiple models at the same time"
https://sqlmesh.readthedocs.io/en/stable/guides/tablediff/?h=table+diff#diffing-tables-or-views
Found a bug. It currently diffs models that are indirect non-breaking. These should be skipped.
Directly Modified: demo__dev.incremental_model (Non-breaking)
└── Indirectly Modified Children:
├── demo__dev.full_model (Indirect Non-breaking)
└── demo__dev.full_model_example (Indirect Non-breaking)
Models to compare:
├── demo.customers
├── demo.full_model
├── demo.full_model_example
├── demo.incremental_model
└── demo.stg_customers
Noticed a UX bug when table_diff
errors out. It makes the cursor invisible in my terminal. I had to run this to see it again.
echo -e '\033[?25h'
bad selector bug. This should NOT run any diffs in this scenario.
(.venv) ➜ sqlmesh-demos git:(sung/vscode) ✗ sqlmesh table_diff prod:dev -m 'git:sungvscode'
fatal: ambiguous argument 'sungvscode...': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Models to compare:
├── demo.stg_customers
├── demo.incremental_model
└── demo.customers
Aborted!
Calculating model differences ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% • pending • 0:00:09
demo.stg_customers ...
demo.incremental_model ...
demo.customers ...
How exactly does this selector flag work? Looks like only models with direct modifications and not the downstream impacts.
(.venv) ➜ sqlmesh-demos git:(sung/vscode) ✗ sqlmesh table_diff prod:dev -m 'git:sung/vscode'
Models to compare:
├── demo.stg_customers
├── demo.customers
└── demo.incremental_model
When I run a selector that fails, it shows nothing. Can you put a message that says something like, "No models match this model selector"
(.venv) ➜ sqlmesh-demos git:(sung/vscode) ✗ sqlmesh table_diff prod:dev -m 'increm*'
Also I found another bug, but this may be unrelated to this PR.
Calculating model differences ━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.0% • 1/5 • 0:00:11
demo.full_model .
demo.full_model_example .
demo.incremental_model .
demo.stg_customers . Error: 10 validation errors for RowDiff
stats.s_count
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
stats.t_count
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
stats.join_count
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
stats.null_grain_count
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
stats.full_match_count
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
stats.item_id_matches
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
stats.num_orders_matches
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
stats.updated_at_matches
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
stats.s_only_count
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
stats.t_only_count
Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/float_type
b333d0f
to
ff00e49
Compare
Thanks for the review @sungchun12 if you want to have another look, addressed comments and updated the docs
yes added in the table diff guide details as well as example use cases of the selectors
addressed it by comparing the data hash instead so that indirect non breaking models are skipped
good catch! added a try catch handle that in this cases properly closes the Live rich console so that the terminal settings are restored such as the cursor
we previously didn't handle the error that was caused by running the git cli command. revised so that we capture the error and display it in a more clean way to the user and raise an appropriate error and not diff
yes in the case of the git selector to get downstream of upstream dependencies of these models it should be used with the
yes added a relevant message that also displays the selectors used
yes I believe this is related to this issue: #4310 that Erin fixed a couple of hours ago |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple suggestions on docs and one more question that may be a bug. If not a bug, I'll approve!
Thanks for removing the models without changes section! Can you make sure to include that same warning message when no models match criteria?
Looks like for git selectors, The bug examples below.
|
Co-authored-by: Sung Won Chung <sungwonchung3@gmail.com>
This reverts commit 447c4e4.
938a078
to
9f2091c
Compare
This update extends using the table_diff command with
--select-model
to diff a subset or selection of SQLMesh models, fixes: #4198diff.mp4
For example with the downstream + to diff
items
and all models downstream of it:sqlmesh table_diff prod:dev sushi.items+
With the * wildcard to diff all models of a schema:
sqlmesh table_diff prod:dev 'sushi.*'
Similarly with all the selectors: https://sqlmesh.readthedocs.io/en/latest/guides/model_selection/