Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Updated generate_model_yaml macro to correctly handle nested bigquery… #54

Merged
merged 5 commits into from
Jun 23, 2022

Conversation

bodschut
Copy link
Contributor

@bodschut bodschut commented Feb 2, 2022

This is a:

  • bug fix PR with no breaking changes — please ensure the base branch is main
  • new functionality — please ensure the base branch is the latest dev/ branch
  • a breaking change — please ensure the base branch is the latest dev/ branch

Description & motivation

Fixes #27

Checklist

  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added tests & descriptions to my models (and macros if applicable)
  • I have added an entry to CHANGELOG.md

@joellabes
Copy link
Contributor

joellabes commented Mar 22, 2022

Hi @bodschut, thanks for opening this - as you noted in #27, we'll definitely want some integration tests to ensure everything's working correctly.

Have a look at https://github.com/dbt-labs/dbt-codegen/tree/main/integration_tests/tests for some examples - you can create another source table following this pattern - you'll need to only run it if the current adapter is BigQuery of course (see target.type)

Copy link
Contributor

@dbeatty10 dbeatty10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bodschut thanks for submitting this! 🏆

I added some tests and everything looks good ✅

@dbeatty10 dbeatty10 merged commit 40c77d4 into dbt-labs:main Jun 23, 2022
{% set column_name = column.name %}
{% endif %}

{% do model_yaml.append(' - name: ' ~ column.name | lower ) %}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for poking an old PR but shouldn't this line read(?):

{% do model_yaml.append('      - name: ' ~ column_name  | lower ) %}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure one way or the other @Zatte

@bodschut Some questions:

  1. What do you think of @Zatte 's question about column_name versus column.name?
  2. Is there something more descriptive than column_name that would be accurate, like aliased_parent_column_name?
  3. Is parent_column_name even used as a parameter? It's not explicitly set here, and the generate_column_yaml looks like a "private" macro internal to codegen rather than a "public" macro intended to be used by consumers.

Copy link

@Zatte Zatte Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context; I use generate_model_yaml-source (which calls this private macro) on a nested struct in BigQuery

With the current state i only get "tail columns" as the column names (not the full parent hierarchy, dot delimited). I would expect the column names to contain the full parent hierarchy.

Example.
SQL

SELECT 
   struct(
      "a" as a, 
      "b" as b
   ) as substruct

Current output

models:
  - name: model_struct
    description: ""
    columns:
      - name: a
        description: ""

      - name: b
        description: ""

Expected output (with column_name change)

models:
  - name: model_struct
    description: ""
    columns:
      - name: substruct.a
        description: ""

      - name: substruct.b
        description: ""

The current output will be ambiguous in the case of multiple sub-structs (nested columns) with the same name like this.

SELECT 
   struct(
      "a1" as a, 
      "b1" as b
   ) as substruct1, 
   struct(
      "a2" as a, 
      "b2" as b
   ) as substruct2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This info is perfect @Zatte !

I was able to reproduce and see that your suggested fix works.

Opened this issue and will follow-up with a PR:
#98

I took a peek at the commit history on this -- looks like @bodschut had it correct the first time around, and I accidentally foobar'd it when resolving a merge conflict 😬

jeremyholtzman pushed a commit that referenced this pull request Apr 10, 2023
#54)

* Updated generate_model_yaml macro to correctly handle nested bigquery fields

* Updated changelog.md

* Update macros/generate_model_yaml.sql

* Integration tests

Co-authored-by: Bob De Schutter <bob.deschutter@digitalswat.be>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generate_model_yaml doesn't recognize nested BigQuery columns
4 participants