[EWT-1250] Sqlalchemy/Superset expectations from python-DBAPI #17
wherobots/db/connection.py
Outdated
schema = reader.schema
columns = schema.names
column_types = [field.type for field in schema]
rows = reader.read_all().to_pandas().values.tolist()
Do you need both `read_all().to_pandas()` or can you `read_pandas()` directly? (to be fair according to the docs it does the same thing)
Good catch! Using read_pandas() now for better readability.
wherobots/db/driver.py
Outdated
},
headers=headers,
if ws_url:
    session_uri = ws_url
Couple of things here:
- This can be done earlier, so if we feel confident in using the `ws_url` we can do that right after checking the token/api-keys.
- This needs a little bit different logging, since we're not requesting a new runtime (see line 70).
- This should early return so that we don't have to indent everything afterwards. Makes it more readable.
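The early-return shape suggested in this comment could look roughly like the sketch below. It is only an illustration: `resolve_session_uri` and `request_runtime_session` are hypothetical names, the URLs are placeholders, and the real driver code may be organized differently. Only `ws_url`, `session_uri` and `headers` come from the diff.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def request_runtime_session(headers: dict) -> str:
    """Hypothetical stand-in for the HTTP call that requests a new runtime."""
    return "wss://example.invalid/requested-session"


def resolve_session_uri(headers: dict, ws_url: Optional[str] = None) -> str:
    """Return the WebSocket URI to connect to."""
    # Assumption: token/API key checks have already happened before this point.
    if ws_url:
        # No runtime is requested on this path, so the log message differs.
        logger.info("Connecting to existing session at %s ...", ws_url)
        return ws_url  # early return keeps the rest of the body unindented

    logger.info("Requesting a new runtime ...")
    return request_runtime_session(headers)
```

With the early return, the runtime-request path stays at the top indentation level instead of living inside an `else` branch.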
@@ -51,6 +50,7 @@ def connect(
results_format: Union[ResultsFormat, None] = None,
data_compression: Union[DataCompression, None] = None,
geometry_representation: Union[GeometryRepresentation, None] = None,
ws_url: str = None,
Why is this required, instead of calling `connect_direct()` directly?
The reason this parameter wasn't included in `connect()` is that it creates ambiguity between the rest of the parameters (like runtime/region) and the runtime you'd actually connect to when providing a `ws_url`, which may not match those choices.
query.handler(json.loads(result_bytes.decode("utf-8")))
data = json.loads(result_bytes.decode("utf-8"))
columns = data["columns"]
column_types = data.get("column_types")
Is `column_types` optional? If so, then it's good that you're using `data.get()` here, but then in `Cursor.__get_results` you expect `column_types` to be non-None. You either need to ensure `column_types` is always provided, or change `__get_results` to be more defensive.
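One way to make the consumer more defensive, as a minimal sketch: `build_description` is a hypothetical helper (not the actual `Cursor.__get_results`), and the contents of `_TYPE_MAP` here are invented for illustration. It assumes that falling back to a string type is acceptable when `column_types` is missing.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical mapping from wire type names to DB-API type codes.
_TYPE_MAP = {"int64": "NUMBER", "double": "NUMBER", "string": "STRING"}


def build_description(
    columns: Sequence[str],
    column_types: Optional[Sequence[Optional[str]]] = None,
) -> List[Tuple]:
    """Build a PEP 249 style description, tolerating missing column_types."""
    if column_types is None:
        # Assumption: with no type information, every column falls back
        # to the default type code chosen below.
        column_types = [None] * len(columns)
    if len(column_types) != len(columns):
        raise ValueError("columns and column_types must have the same length")
    return [
        (
            col_name,
            _TYPE_MAP.get(col_type, "STRING"),  # type_code
            None,  # display_size
            None,  # internal_size
            None,  # precision
            None,  # scale
            True,  # null_ok; assuming all columns can accept NULL values
        )
        for col_name, col_type in zip(columns, column_types)
    ]
```

The explicit length check matters because `zip` silently truncates to the shorter input, which would otherwise hide a mismatch between the two lists.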
None, # precision
None, # scale
True, # null_ok; Assuming all columns can accept NULL values
)
for col_name in result.columns
for i, col_name in enumerate(columns)
Use https://docs.python.org/3/library/functions.html#zip to avoid jumping through hoops with an index (it's much nicer to read, and also more efficient):
    self.__description = [
        (
            col_name,
            _TYPE_MAP.get(col_type, 'STRING'),
            ...
        )
        for (col_name, col_type) in zip(columns, column_types)
    ]
This PR introduces the following requirements, which arose from the Wherobots x Superset integration:
- `Row` object.
- `rollback()` and `commit()` to be implemented. Other OLAP database drivers such as pyhive simply `pass` in their unimplemented `rollback()` and `commit()` methods. For context: Superset's background processes often bypass the SQLAlchemy dialect and interact directly with the DBAPI, which is why overriding the rollback and commit methods in the Dialect doesn't suffice.
- `ws_url`, to `connection`. This helps maintain a static connection pool configuration in Superset.
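As a rough illustration of the `rollback()`/`commit()` requirement, a DB-API connection for an engine without transactions can accept both calls and do nothing. The class below is a hypothetical minimal sketch, not the actual Wherobots `Connection`. PEP 249 notes that modules without transaction support should implement `commit()` with void functionality.

```python
class Connection:
    """Minimal sketch of a DB-API connection for a non-transactional engine."""

    def __init__(self) -> None:
        self.closed = False

    def commit(self) -> None:
        # There is no transaction to commit; accept the call and do nothing
        # so that callers talking to the DB-API directly do not fail.
        pass

    def rollback(self) -> None:
        # Nothing to roll back either; a no-op instead of raising.
        pass

    def close(self) -> None:
        self.closed = True
```

Because the no-ops live on the connection object itself, they also cover callers that bypass the SQLAlchemy dialect.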