Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

chore(sqllab): Cleanup /tables/... endpoint #21284

Merged

Conversation

john-bodley
Copy link
Member

@john-bodley john-bodley commented Sep 1, 2022

SUMMARY

The /tables/... endpoint was augmented (#1466) when SQL Lab supported:

  1. Fetching tables/views across either a single or all schemas, i.e., the SEE TABLE SCHEMA functioned (if configured) without having a schema specified.
  2. Backend filtering.

however the endpoint is now always called with a specified schema and filtering (substring table name matching) is never invoked.

Given these observations this PR refactors the endpoint—by removing unnecessary logic and providing a more performant ORM query given that all table/views are from a specific schema—and reduces the size of the response payload to improve performance (especially for schemas with a large number of tables/views).

Additionally as a byproduct of the refactor this PR removes:

  1. The (now unused) get_all_table_names_in_database and get_all_view_names_in_database utility methods as well as the get_all_datasource_names DB engine spec method.
  2. The (now unused) dbs.allow_multi_schema_metadata_fetch column
  3. The non-functional MAX_TABLE_NAMES config key, i.e., the logic (which potentially was wrong) only capped the number of tables/views if and only if a filter was defined, however this was never invoked.
  4. The superset update-datasources-cache CLI command.
  5. Unnecessary keys in the response payload (to reduce the payload size).

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

Updated unit tests and manually tested.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

cc: @justinpark

@john-bodley john-bodley force-pushed the john-bodley--refactor-tables-views branch 2 times, most recently from 880dcb9 to 0baba51 Compare September 1, 2022 04:18
@john-bodley john-bodley marked this pull request as ready for review September 1, 2022 04:26
@john-bodley john-bodley requested a review from a team as a code owner September 1, 2022 04:26
@john-bodley john-bodley force-pushed the john-bodley--refactor-tables-views branch 2 times, most recently from b7be637 to 88367ff Compare September 1, 2022 16:27
Copy link
Member

@ktmud ktmud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a couple of questions

"allow_multi_schema_metadata_fetch",
sa.Boolean(),
nullable=True,
default=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the default be false? I think enabling multi schema fetch for all data sources is a bigger risk than disabling it for all.

@expose("/tables/<int:db_id>/<schema>/<substr>/<force_refresh>/<exact_match>")
def tables( # pylint: disable=too-many-locals,no-self-use,too-many-arguments
@expose("/tables/<int:db_id>/<schema>/")
@expose("/tables/<int:db_id>/<schema>/<force_refresh>/")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are here, can we change force_refresh to a search params, too?

@john-bodley john-bodley force-pushed the john-bodley--refactor-tables-views branch 3 times, most recently from f13fd45 to 15f0179 Compare September 5, 2022 04:47
@codecov
Copy link

codecov bot commented Sep 5, 2022

Codecov Report

Merging #21284 (f44908c) into master (094400c) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master   #21284   +/-   ##
=======================================
  Coverage   66.55%   66.56%           
=======================================
  Files        1791     1791           
  Lines       68591    68497   -94     
  Branches     7319     7319           
=======================================
- Hits        45651    45595   -56     
+ Misses      21049    21011   -38     
  Partials     1891     1891           
Flag Coverage Δ
hive 53.03% <35.71%> (+0.09%) ⬆️
javascript 52.73% <100.00%> (ø)
mysql 80.84% <100.00%> (+0.06%) ⬆️
postgres 80.89% <100.00%> (+0.06%) ⬆️
presto 52.93% <35.71%> (+0.09%) ⬆️
python 81.36% <100.00%> (+0.06%) ⬆️
sqlite 79.43% <100.00%> (+0.05%) ⬆️
unit 51.02% <35.71%> (+0.06%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...frontend/src/components/DatabaseSelector/index.tsx 95.23% <ø> (ø)
.../CRUD/data/database/DatabaseModal/ExtraOptions.tsx 83.33% <ø> (ø)
superset/cli/update.py 0.00% <ø> (ø)
superset/config.py 91.58% <ø> (-0.03%) ⬇️
superset/dashboards/schemas.py 99.46% <ø> (-0.01%) ⬇️
superset/databases/api.py 95.49% <ø> (ø)
superset/db_engine_specs/base.py 89.56% <ø> (+1.23%) ⬆️
superset/db_engine_specs/hive.py 87.35% <ø> (+0.24%) ⬆️
superset/db_engine_specs/presto.py 87.78% <ø> (-0.18%) ⬇️
superset/db_engine_specs/sqlite.py 96.42% <ø> (-0.87%) ⬇️
... and 6 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@john-bodley john-bodley force-pushed the john-bodley--refactor-tables-views branch from 15f0179 to 3dd8fc8 Compare September 5, 2022 05:54
@justinpark
Copy link
Member

@john-bodley please mark Removes existing feature or API since this commit removes allow_multi_schema_metadata_fetch (Allow Multi Schema Metadata Fetch) from Database option.

Copy link
Member

@justinpark justinpark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@john-bodley john-bodley merged commit eac6fdc into apache:master Sep 13, 2022
@john-bodley john-bodley deleted the john-bodley--refactor-tables-views branch September 13, 2022 15:22
@EugeneTorap
Copy link
Contributor

EugeneTorap commented Sep 14, 2022

Hi @john-bodley, the PR reproduces this tables loading issue in master branch. Can you fix it in next PR?
Can we also cover the case with a unit test?

Screenshot 2022-09-14 at 14 33 03

john-bodley added a commit to john-bodley/superset that referenced this pull request Sep 14, 2022
sadpandajoe added a commit to preset-io/superset that referenced this pull request Sep 15, 2022
@betodealmeida
Copy link
Member

betodealmeida commented Sep 16, 2022

@john-bodley It looks like this broke loading tables in SQL Lab... I'm getting 404 now when it tries to load the list.

Just saw your fix.

eschutho added a commit that referenced this pull request Sep 16, 2022
Co-authored-by: Elizabeth Thompson <eschutho@gmail.com>
jinghua-qa pushed a commit to preset-io/superset that referenced this pull request Sep 19, 2022
Co-authored-by: Elizabeth Thompson <eschutho@gmail.com>
(cherry picked from commit 8c16806)
mayurnewase added a commit to mayurnewase/incubator-superset that referenced this pull request Sep 22, 2022
@john-bodley
Copy link
Member Author

Apologies this PR did not adhere to SIP-15, i.e., the database migration should have been added in a future PR and included in the subsequent major release.

@rusackas
Copy link
Member

@john-bodley someone was just pointing out that this PR broke a sync job. I think technically since this changes the endpoints/URIs of the API, this would be considered a breaking change, namely as things move from

@expose("/tables/<int:db_id>/<schema>/<substr>/")
@expose("/tables/<int:db_id>/<schema>/<substr>/<force_refresh>/")
@expose("/tables/<int:db_id>/<schema>/<substr>/<force_refresh>/<exact_match>")

to

@expose("/tables/<int:db_id>/<schema>/")
@expose("/tables/<int:db_id>/<schema>/<force_refresh>/")
@deprecated(new_target="api/v1/database/<int:pk>/tables/")

I'm not sure if there's a way to patch this to be non-breaking at this point, but maybe adding a breaking change notice to UPDATING.md would at least raise awareness.

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.1.0 and removed 🚢 2.1.3 labels Mar 13, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XL 🚢 2.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants