Add include_data_types argument to generate_model_yaml macro #122
Set to true as default; update tests
@linbug Thank you for picking this up! This will be a great addition for folks who start using model contracts in v1.5
macros/generate_model_yaml.sql
Outdated
{% if parent_column_name %}
{% set column_name = parent_column_name ~ "." ~ column.name %}
{% else %}
{% set column_name = column.name %}
{% endif %}

{% do model_yaml.append(' - name: ' ~ column_name | lower ) %}
{% if include_data_types %}
{% do model_yaml.append(' data_type: ' ~ column.data_type | lower) %}
Depending on the adapter / data platform, we may want this to be column.dtype instead of column.data_type. I believe that will enable this to return text or varchar instead of character varying(1234).
We could use the dbt.format_column macro, which is also what's used when doing the column type assertion for contracted models.
{% set formatted = adapter.dispatch('format_column', 'dbt')(column) %}
{% do model_yaml.append(' data_type: ' ~ formatted['data_type'] | lower) %}
If we do make this change, the format_column macro is new in v1.5, so this would require a version bump to codegen and to its require-dbt-version.
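For readers less familiar with the macro, here is a minimal Python sketch of the shape of value the snippet above relies on: a dictionary whose data_type entry is built from the column's dtype. The Column class, format_column function, and yaml_lines helper are hypothetical stand-ins written for illustration, not dbt's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """Hypothetical stand-in for an adapter column object."""
    name: str
    dtype: str      # bare type, e.g. "character varying"
    data_type: str  # type with size/precision, e.g. "character varying(1234)"


def format_column(column: Column) -> dict:
    """Return a dict with a 'data_type' key, like `formatted` in the snippet above.

    Assumption: the value comes from `dtype`, so size/precision is dropped.
    """
    return {"name": column.name, "data_type": column.dtype}


def yaml_lines(column: Column, include_data_types: bool = True) -> list:
    """Build the YAML lines for one column, lowercasing name and type."""
    lines = ["- name: " + column.name.lower()]
    if include_data_types:
        formatted = format_column(column)
        lines.append("  data_type: " + formatted["data_type"].lower())
    return lines


col = Column(
    name="EMAIL",
    dtype="CHARACTER VARYING",
    data_type="CHARACTER VARYING(1234)",
)
print("\n".join(yaml_lines(col)))
```

The point of routing through one formatting function is that the YAML generator never has to decide between dtype and data_type itself.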
@jtcohen6 good point about column.dtype vs. column.data_type!
Using dbt.format_column macro
I like your idea of using the logic within the dbt.format_column macro. I think we should aim to preserve the wide range of require-dbt-version: [">=1.0.0", "<2.0.0"] though.
There are multiple approaches we could take to bring this logic into codegen without forcing an upgrade to 1.5.0. The quick'n'dirty option is to "vendor" it by just copy-pasting it into codegen. Another option is to inspect the dbt version, and fall back to a vendored version of the macro if it is less than 1.5.0.
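The second option (inspect the version, fall back to a vendored copy) could look roughly like the Python sketch below. The function names and the policy of distrusting pre-release tags are assumptions made for illustration; nothing here is dbt's own version-handling code. It mainly shows why the comparison is fiddlier than a plain tuple check.

```python
import re


def parse_version(version: str):
    """Split '1.5.0-b5' into ((1, 5, 0), 'b5'); the tag is '' for final releases."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)(?:[-.]?([A-Za-z]+\d*))?$", version)
    if match is None:
        raise ValueError("unrecognized version: " + repr(version))
    major, minor, patch, tag = match.groups()
    return (int(major), int(minor), int(patch)), tag or ""


def choose_implementation(version: str) -> str:
    """Return 'builtin' when the built-in macro can be trusted, else 'vendored'.

    Assumption: a pre-release of exactly 1.5.0 may predate the macro,
    so it is treated like an older version.
    """
    numbers, tag = parse_version(version)
    if tag:
        return "builtin" if numbers > (1, 5, 0) else "vendored"
    return "builtin" if numbers >= (1, 5, 0) else "vendored"


for v in ("1.4.6", "1.5.0-b5", "1.5.0", "1.6.0rc1"):
    print(v, "->", choose_implementation(v))
```

Always using the vendored copy avoids this branch entirely, at the cost of keeping the copy in sync by hand.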
Difference between dtype and data_type
My understanding of the difference between the two is that dtype gives the database-specific data type with no size, scale, or precision (like varchar or decimal), whereas data_type is intended to give the database-specific data type with those included (like varchar(80) or decimal(18, 2)).
Although dbt model contracts can operate with either input, they will effectively ignore any size/precision/scale that is supplied.
So it would be best to exclude size/precision/scale to avoid any false impression that it will be verified as part of the dbt model contract.
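To illustrate the distinction, here is a small Python sketch that reduces a data_type-style string to its dtype-style form by dropping a parenthesized suffix. The strip_type_modifiers helper is hypothetical and only handles simple, non-nested suffixes; adapters expose dtype directly, so this is purely a demonstration of what gets excluded.

```python
import re


def strip_type_modifiers(data_type: str) -> str:
    """Drop a parenthesized size/precision/scale suffix from a type name.

    Limitation: nested parentheses (e.g. complex array or struct types)
    are not handled by this simple pattern.
    """
    return re.sub(r"\s*\([^)]*\)", "", data_type).strip()


for raw in ("varchar(80)", "decimal(18, 2)", "character varying(1234)", "text"):
    print(raw, "->", strip_type_modifiers(raw))
```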
💡 We should fix column.data_type so its value is always valid. It feels very doable to make it work across adapters, and relevant issues opened here, here, and here.
💡 Once column.data_type is fixed, we should consider expanding dbt.format_column() to include it as well.
Ok so I added this which I feel should work, but for some reason my tests fail locally because formatted is empty. My dbt version is 1.5.0-b5, which is a pre-release and maybe doesn't have dbt.format_column yet? If you have suggestions for getting around this please let me know, otherwise I can use the vendored version for everything.
@dbeatty10 @jtcohen6 any thoughts?
@dave-connors-3 @benmosher We may want to get this across the line now that generate model is live in the IDE. It would be a good improvement to the functionality.
macros/generate_source.sql
Outdated
@@ -62,7 +62,7 @@
 {% for column in columns %}
 {% do sources_yaml.append(' - name: ' ~ column.name | lower ) %}
 {% if include_data_types %}
-{% do sources_yaml.append(' data_type: ' ~ (column.data_type | upper ) ) %}
+{% do sources_yaml.append(' data_type: ' ~ (column.data_type | lower ) ) %}
As above, we may want to use .dtype instead of .data_type, depending on the data platform, and the format_column macro (new in v1.5) could be our friend here.
Are you looking for additional contributors on this to help keep it moving? Happy to help however I can but obviously don't want to inject myself where I shouldn't.
@linbug -- this looks great -- if you are able to resolve the conflicts, we can do a final review and release on this!
Jumping in as someone who has been watching the PR for over a month now - I'm so excited that this is finally going to be merged in! This is a feature I've been waiting for before our team implements contracts, thank you all so much for working on this!
@linbug do you need any help getting this PR over the finish line? If you don't have time I would love to help!
I'm going to take a shot at resolving the merge conflicts and then do final review.
@dbeatty10 you are a champion of life
Let me know if you need help.
Thanks again for contributing this @linbug!
My apologies to you and everyone else that have been waiting for this to be merged 🙏
I pushed two main changes:
- b2f463c - rely only on the "vendored" version of format_column (rather than using dbt.default__format_column)
  - comparing dbt versions robustly is more trouble than it was worth in this case, so this change greatly simplified things
- 6f6aea6 - allow end users to configure their own data type formatting by using multiple dispatch to override the data_type_format_source and/or data_type_format_model macros.
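As a rough mental model of the override mechanism in the second change, the Python sketch below resolves an implementation by walking a list of namespaces in order and preferring an adapter-specific name over a default one. This is a simplified illustration, not dbt's dispatch code; the registry layout, the my_project namespace, and both formatter functions are hypothetical.

```python
def codegen_default(column):
    """Hypothetical package default: lowercase the bare type."""
    return column["dtype"].lower()


def project_override(column):
    """Hypothetical end-user override: uppercase the bare type."""
    return column["dtype"].upper()


def dispatch(macro_name, registry, search_order, adapter_type="postgres"):
    """Return the first matching implementation.

    Namespaces are tried in `search_order`; within each one, the
    adapter-specific name is tried before the `default__` name.
    """
    for namespace in search_order:
        macros = registry.get(namespace, {})
        for prefix in (adapter_type, "default"):
            candidate = macros.get(prefix + "__" + macro_name)
            if candidate is not None:
                return candidate
    raise LookupError("no implementation found for " + macro_name)


registry = {
    "codegen": {"default__data_type_format_model": codegen_default},
    "my_project": {"default__data_type_format_model": project_override},
}
column = {"name": "amount", "dtype": "Numeric"}

packaged = dispatch("data_type_format_model", registry, ["codegen"])
overridden = dispatch("data_type_format_model", registry, ["my_project", "codegen"])
print(packaged(column), overridden(column))
```

Putting the project's namespace ahead of codegen in the search order is what lets a user-defined formatter win without editing the package.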
Apologies for missing the pings about resolving conflicts. Thank you for merging this!
No prob @linbug -- really appreciate the great work you did here! 🏆
resolves dbt-labs/dbt-core#120
Also:
- include_data_types value for generate_source to True
- generate_source and generate_model_yaml