How should the SDK deal with settings for load_schema
and schema_mappings
for SQL-based targets
#1084
Replies: 3 comments 5 replies
-
I believe #1036 partially implements the Currently the SDK does not provide a way to set the schema/search path on the connection, without relying on a snippet as @BuzzCutNorman suggests here or the inclusion of To resolve this in #1036 I suggest we:
e.g. engine = sqlalchemy.create_engine(
self.sqlalchemy_url,
connect_args=self.config.get('sqlalchemy_connect_args', {}),
echo=False
) with the config snippet: sqlalchemy_url:
sqlalchemy_connect_args:
options: "-csearch_path=my_schema" We could then (additionally and in a future PR) support a @aaronsteers @BuzzCutNorman WDYT? |
Beta Was this translation helpful? Give feedback.
-
@aaronsteers no super strong feelings. I like |
Beta Was this translation helpful? Give feedback.
-
Notes from Office Hours 2/11/2022:
Outcome: deliver |
Beta Was this translation helpful? Give feedback.
-
Discussion on spec to implement here:
schema_mapping
#1086 (comment)Pipelinewise has a precedent of using
default_target_schema
for the default load schema, andschema_mappings
to allow overrides/remappings based on the upstream source system name.For ref: https://github.com/transferwise/pipelinewise-target-snowflake#configuration-settings
Other implementations use
schema
as the setting name instead ofdefault_target_schema
, but I don't personally prefer this becauseschema
is inherently a highly overloaded term and often applied for JSON schema (which this isn't). The namesdefault_target_schema
and/orload_schema
both seems more clear to me.I personally slightly prefer
load_schema
overschema
ordefault_target_schema
for the name of the default load schema setting - but I don't feel very strongly about it and I can see some benefit of using thedefault_target_schema
setting name, which has significant precedent. I don't see any need to renameschema_mappings
because it also seems clear, concise, and relatively intuitive - although admittedly we don't need to add this feature immediately.Now that we are building out the SQL target capabilities in the SDK, would be helpful to put some thought into the naming here.
Does anyone else have strong feelings about this?
cc @tayloramurphy, @edgarrmondragon, @kgpayne
From the Pipelinewise docs about the behavior of each of their 2 settings:
default_target_schema
- Name of the schema where the tables will be created, without database prefix. Ifschema_mapping
is not defined then every stream sent by the tap is loaded into this schema.schema_mapping
- Useful if you want to load multiple streams from one tap to multiple Snowflake schemas. If the tap sends thestream_id
in<schema_name>-<table_name>
format then this option overwrites thedefault_target_schema
value.Beta Was this translation helpful? Give feedback.
All reactions