DataHub v0.8.45
Release Highlights
User Experience
- Allow Term Groups to be the target of permissions
- Customize browser favicon via REACT_APP_FAVICON_URL param
- Some UX improvements for charts & dashboards entity pages to reduce confusion
- Performance improvements on the lineage visualization
- Search bar for dataset schema tab
Developer Experience
- Add REST endpoint for restoring indices of a single entity (/aspects?action=restoreIndices)
- Create new platform instances via CLI
- Improved impact analysis performance due to an added caching layer
- Support for Patch as seen in August 2022 town hall.
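As a rough illustration of the restore-indices endpoint listed above, the sketch below builds (but does not send) a POST request in Python. The path /aspects?action=restoreIndices comes from these notes. The JSON body with a single "urn" field, the bearer-token header, the localhost GMS address, and the helper name build_restore_indices_request are assumptions made for illustration, so check them against your deployment before relying on them.

```python
import json
import urllib.request
from typing import Optional


def build_restore_indices_request(
    gms_url: str, urn: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST to /aspects?action=restoreIndices."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: token-based auth uses a standard bearer header.
        headers["Authorization"] = f"Bearer {token}"
    # Assumption: the endpoint accepts a JSON body with a single "urn" field.
    body = json.dumps({"urn": urn}).encode("utf-8")
    return urllib.request.Request(
        url=f"{gms_url.rstrip('/')}/aspects?action=restoreIndices",
        data=body,
        headers=headers,
        method="POST",
    )


# Hypothetical GMS address and dataset urn, for illustration only.
request = build_restore_indices_request(
    "http://localhost:8080",
    "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)",
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the URL and payload easy to inspect before anything touches a live instance.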
Metadata Ingestion
- Introduces bigquery-beta source
- Looker source memory usage dramatically reduced
- Report memory usage during ingestion
- Improve Tableau lineage
- Usage statistics for Tableau
- LookML can automatically clone your Git repository. LookML is now supported in UI-based ingestion.
- dbt supports column-level meta mappings
- Support for deletion & rollback of time series data
- Upgrade to browse path forms
What's Changed
- fix(privileges) Add Term Groups as targetable entities for privileges by @chriscollins3456 in #5806
- fix(javadocs): remove ampersand from pdl causing issue in doc generation for openapi by @RyanHolstien in #5808
- chore(ingest): remove archived docs by @hsheth2 in #5793
- feat(ingest): add rewrite option for metadata file check by @hsheth2 in #5763
- feat(cli): add support for sampled reporting to keep logs manageable by @shirshanka in #5800
- docs(refactor): Refactor Tags Feature Guide by @maggiehays in #5781
- docs(feature-guide) Impact Analysis by @maggiehays in #5765
- feat(theming): set custom favicon via env var by @gabe-lyons in #5810
- test(smoke-test): check debug arg in executor requests by @hsheth2 in #5811
- fix(ingest): bigquery-beta - Fixing dependencies by @treff7es in #5814
- feat(ingest): looker - reduce memory requirements by @shirshanka in #5815
- feat(restore-indices): add endpoint for restore indices, add basic check for graph by @anshbansal in #5805
- fix(frontend): download node only when USE_SYSTEM_NODE is set to false by @szalai1 in #5817
- doc: Make Airflow link clickable by @daha in #5803
- feat(ingest):looker - reduce mem usage, misc reporting improvements by @shirshanka in #5823
- feat(model, ingest): populate sizeInBytes in snowflake, fall back to table level profiling for large tables by @mayurinehate in #5774
- chore(docker): make curl/wget commands quiet in docker by @hsheth2 in #5819
- chore: cleanup references to the old ember app by @hsheth2 in #5797
- fix(ingest): spark-lineage: Adding additional debug logs to spark lineage by @treff7es in #5772
- fix(docker): add missing port mappings for non-neo4j quickstart by @hsheth2 in #5799
- fix(ingest): looker - report dashboard scanning correctly by @shirshanka in #5829
- feat(cli): report memory usage during ingest by @shirshanka in #5828
- fix(ingest): presto-on-hive - Fixing mysql filter by @treff7es in #5825
- docs(big query): add needed delete permission to list by @maaaikoool in #5826
- chore(ingest): set isort combine_as_imports by @hsheth2 in #5820
- fix(ingest): use AwsConnectionConfig instead of AwsSourceConfig by @hsheth2 in #5813
- feat(ingest): looker test connection by @hsheth2 in #5768
- feat(ingest): improve tableau lineage, workbooks query, fix pagination by @mayurinehate in #5756
- fix(ingest): profiling - memory usage reduction by @shirshanka in #5830
- feat(monitoring): enable JMX and OTEL for frontend pods by @szalai1 in #5834
- fix(standalone-consumers): Exclude Solr from spring boot application config & make them run on M1 by @pedro93 in #5827
- feat(hooks): Add toggle for enabling/disabling platform event hook by @pedro93 in #5840
- feat(transformers): Add semantics & transform_aspect support in transformers by @mohdsiddique in #5514
- feat(ci): auto label PRs by @anshbansal in #5839
- feat(inputs): improving clarity on inputs for dashboards by @gabe-lyons in #5841
- feat(ingest): add utility for converting MCEs to MCPs by @hsheth2 in #5812
- chore(smoke): add additional log in smoke test by @hsheth2 in #5842
- fix(ingest): fix doc generation import ordering issue with postgres by @hsheth2 in #5846
- feat(docker) Adds Sasl support to base ingestion image by @pedro93 in #5855
- fix(graphql) Fix null pointer exception when fetching entity aspect via graphql by @chriscollins3456 in #5857
- fix(ingest): reporting should work with timestamps by @shirshanka in #5860
- fix(patch-entity-registry): Remove exception for entities with key aspects. by @pghazanfari in #5831
- fix(browse): Fixing browse path to remove requirement for simple name suffix by @jjoyce0510 in #5634
- fix(ingest): bigquery - Fixing sharded regexp pattern config by @treff7es in #5861
- perf(elastic search graph service): improving perf of lineage query by @gabe-lyons in #5858
- chore(ingest): remove outdated GE compatibility hack by @hsheth2 in #5862
- ci(ingest): test with python 3.10 by @hsheth2 in #5863
- docs: improve doc generation, add better docs for snowflake, looker by @shirshanka in #5867
- feat(ci): tweak auto-label globs by @anshbansal in #5849
- fix(m1): preflight works with brew postgres@14 by @shirshanka in #5868
- feat(smoke-tests) Make smoke tests use standalone consumers by @pedro93 in #5856
- fix(domains): adding 10,000+ text when domain list caps out elastic count capacity by @gabe-lyons in #5838
- docs(notifications): slack notification docs by @anshbansal in #5871
- feat(docker): Update Dockerfiles to use java 11 runtime by @pedro93 in #5853
- Scroll issue on Glossary related entity page by @Ankit-Keshari-Vituity in #5804
- fix(ingest): include urns in rest sink failure logs by @hsheth2 in #5848
- fix(docker): Bumps JRE 11 to latest by @pedro93 in #5875
- feat(ingest): support reading config file from stdin by @hsheth2 in #5847
- fix(ingest): remove dbt delete_tests_as_datasets option by @hsheth2 in #5865
- fix(ingest): avrogen handling for missing fields with default values by @hsheth2 in #5844
- refactor(ingest): add ALL_ENV_TYPES constant by @hsheth2 in #5866
- feat(cli) Make docker compose quiet by @pedro93 in #5869
- feat(datahub-protobuf): add support for shadow jar, publish by @shirshanka in #5882
- feat(jars): better jar versioning for datahub-client, spark-lineage and protobuf by @shirshanka in #5883
- fix(dev-docker): set right context for frontend dev build by @szalai1 in #5885
- fix(ci): fix jar release action dependencies by @shirshanka in #5884
- feat(schema) Add search filter to Schema tab by @chriscollins3456 in #5845
- feat(ui) Add entity search input to shared folder by @chriscollins3456 in #5876
- fix(ingest): datahub-api - move instantiation to the right config class by @shirshanka in #5878
- feat(ingest): looker - improve defaults for usage extraction now that… by @shirshanka in #5893
- fix(ingest): add missing trino types by @ms32035 in #5870
- fix(ingest): remove dbt disable_dbt_node_creation and load_schema options by @hsheth2 in #5877
- fix(frontend): forward Host header as X-Forwarded-Host by @codesorcery in #5816
- feat(ingestion-ui) Sort ingestion sources by last execution time by @chriscollins3456 in #5807
- docs(schema-history): update schema history docs by @aditya-radhakrishnan in #5894
- docs(feature-guide) Refactor Domains feature guide by @chriscollins3456 in #5859
- feat(charts & dashboards): improving chart & dashboard entity page rendering by @gabe-lyons in #5864
- test(ingest): use pytest parameterization for dbt integration tests by @hsheth2 in #5879
- refactor(ingest): move aspect maps to dedicated file by @hsheth2 in #5821
- feat(ingest): make sink use type annotations by @hsheth2 in #5899
- feat(ingest): add entity type inference to mcpw by @hsheth2 in #5880
- feat(deletion & rollback): Server & Client side changes to support timeseries aspect deletion & rollback. by @rslanka in #4756
- fix(ci): fix globs for auto-labeling PRs by @anshbansal in #5903
- feat(ingest): minor changes in snowflake-beta source, add basic tests by @mayurinehate in #5910
- feat(ingestion): schema inference for jsonlines in S3 by @hieunt-itfoss in #5725
- refactor(gms): Refactoring util + entity client class locations by @jjoyce0510 in #5902
- fix(ingest): support relative paths in lookml base_folder by @hsheth2 in #5914
- feat(graphql) Add new 'entities' endpoint to batch get entities by urn by @chriscollins3456 in #5915
- fix(ingest): handle lookml directory being in a symlink'd folder by @hsheth2 in #5916
- fix(doc) - fix boolean property in json schema by @treff7es in #5919
- fix(ingest): hide internal profiler.allow_deny_patterns from config by @hsheth2 in #5619
- feat(ingestion-ui) Display ingestion sources in UI more dynamically by @chriscollins3456 in #5789
- fix(docs): Feast recipe documentation by @danilopeixoto in #5917
- fix(docker): skip zoneinfo backport on newer python versions by @hsheth2 in #5912
- fix(favicons): update multiple favicons if present by @gabe-lyons in #5927
- feat(ingest): lookml - support for git checkout by @shirshanka in #5924
- refactor(ingest): simplify DefaultSQLParser alias by @hsheth2 in #5935
- feat(ingest): support version option in datahub client get_aspect_v2 by @hsheth2 in #5934
- fix(ingest): add dbt redshift type mappings by @hsheth2 in #5933
- fix(ci): process older issues first by @anshbansal in #5926
- fix(ingest): move git requirement into lookml deps by @hsheth2 in #5932
- feat(elastic-setup): more verbose logging by @anshbansal in #5937
- fix(docker) Add platform to docker-compose command by @firasomrane in #5683
- fix(airflow): prioritize conn_type over id-based type inference by @hsheth2 in #5911
- feat(ingestion): Refactor standard state-handling tasks into a common handler that are common across all stateful ingestion sources. by @rslanka in #5766
- fix(docs): lookml - clean up links by @shirshanka in #5938
- feat(search): Add support for Elasticsearch object field type by @justinas-marozas in #5891
- feat(ingest): add ConfigEnum type by @hsheth2 in #5734
- fix(transformer): fix invalid lastModified.actor entry in transformers by @mayurinehate in #5906
- refactor(gms): Adding Java Entity Services by @jjoyce0510 in #5931
- fix(ingest): fix type annotations on some pydantic fields by @hsheth2 in #5795
- feat(docs) Adds guide on how to use personal access tokens by @pedro93 in #5873
- feat(ingestion) Add more info to glue entities by @skrydal in #5874
- feat(ingestion-ui) Add LookML form to managed ingestion UI by @chriscollins3456 in #5939
- fix(ingest): Fix snowflake usage stateful ingestion backward compatibility check. by @rslanka in #5943
- feat(ingest): skip ssh known_hosts verification for git clone by @hsheth2 in #5945
- fix(ui) Make LookML deploy key a textarea and format key by @chriscollins3456 in #5946
- feat(docs): lookml - add video for ui ingestion, remove secret step u… by @shirshanka in #5948
- fix(ingest): add trailing newline to ssh keys by @hsheth2 in #5947
- docs(airflow): show that rest host points at gms by @hsheth2 in #5913
- feat(delta-lake): support delta tables from minio by @MugdhaHardikar-GSLab in #5758
- refractor(snowflake): move snowflake-beta to certified snowflake source by @mayurinehate in #5923
- docs: datahub_conn_id => conn_id in Airflow Integration by @GyuhoonK in #5920
- fix(docs): fix loom embed by @shirshanka in #5956
- fix(build): fix preflight script for m1 for lib postgresql by @shirshanka in #5957
- docs(airflow) add back auth token info for airflow by @hsheth2 in #5960
- feat: qualifiedName support + populating glue ARN by @skrydal in #5952
- feat(elastic-setup): add better error handling by @anshbansal in #5963
- docs(ingest): add example of ssl with mysql by @hsheth2 in #5954
- feat(ingestion-ui) Use CLI version defined in recipe for test connection by @chriscollins3456 in #5959
- feat(ingest): add support for aliases in plugin registry by @hsheth2 in #5958
- feat(elasticsearch-setup): Add insecure option for curl by @BogdanAntoniu78 in #5887
- feat(ci): exempt from stale label by @anshbansal in #5964
- fix(ingest): continue validation of s3 path_specs even if platform is set by @mayurinehate in #5951
- fix(ui) Filter search through schemas based on non-editable tags and terms as well by @chriscollins3456 in #5942
- refactor(ingest): use typed objects in dbt ingestion by @hsheth2 in #5929
- feat(ingest): bigquery-v2 - Add profiling feature by @treff7es in #5953
- fix(quickstart): elasticsearch-setup script fails on curl by @shirshanka in #5975
- feat(protobuf): support for custom platform, subtypes, misc improvements by @shirshanka in #5973
- feat(dashboard): add subTypes aspect to dashboard entity by @Masterchen09 in #5843
- fix(patch-entity-registry): Use AspectSpec to retrieve aspect class in order to support custom entity ingestion via patch mechanism. by @pghazanfari in #5936
- fix(ingest): handle uname missing on windows by @hsheth2 in #5981
- ci: reduce pip backtracking in airflow plugin by @hsheth2 in #5982
- fix(ingest): pin great-expectations dependency to fix profiling error by @mayurinehate in #5980
- refactor(looker): Migrating to new browse path format by @jjoyce0510 in #5688
- feat(ingest): dbt column-level meta mappings + add_terms operation by @hsheth2 in #5970
- fix(datahub-ranger-plugin): add support to publish jars by @shirshanka in #5983
- feat(docs): Adds docs on K8s scheduled ingestion by @pedro93 in #5984
- feat(ingest): presto-on-hive stateful, catalog and pascalcase subtype config by @aezomz in #5890
- fix(kafka-setup) Bump kafka version to existing version by @pedro93 in #5995
- Small UI bugs by @Ankit-Keshari-Vituity in #5886
- feat(data platform) adding data platform indexing & select platform modal in frontend by @gabe-lyons in #5988
- Worked on the pop-over of matched entity by @Ankit-Keshari-Vituity in #5851
- refactor(docs): Updating GraphQL documentation with fixes, and more examples by @jjoyce0510 in #5989
- docs(upgrading-datahub): Add docs for upgrading-datahub v0.8.44 by @jjoyce0510 in #5998
- chore(docs-website): bump docusaurus to v2.1.0 by @hsheth2 in #5792
- docs: remove wip UI ingestion doc by @hsheth2 in #6000
- feat(cli): extend put command to support platform creation by @shirshanka in #5990
- feat(ingest): presto-on-hive - Description and hive view support by @treff7es in #5993
- fix(ingest): encode reserved characters when creating dataset urn by @mayurinehate in #5977
- feat(ingestion): Tableau dashboard & chart usage stats by @mohdsiddique in #5999
- feat(ingestion): Documentation on adding stateful ingestion use-cases to new sources by @rslanka in #5985
- refactor(ingest): streamline two-tier db config validation by @hsheth2 in #5986
- feat(ui): Add Nullable label to Schema Render by @mkamalas in #5909
- refactor(frontend): Misc frontend improvements by @jjoyce0510 in #6012
- fix(docs-site): make banner a bit smaller by @hsheth2 in #6016
- feat(ingest): add --no-pull-images option to docker quickstart by @hsheth2 in #6017
- docs(users): move changing default user up by @anshbansal in #6020
- Worked on the dynamic column width of stats table by @Ankit-Keshari-Vituity in #5996
- Worked to hide the recent search from search bar by @Ankit-Keshari-Vituity in #6021
- docs: fix a markdown typo by @ltxlouis in #6019
- fix(ingest): lookml - make explore parsing more robust by @shirshanka in #6028
- fix(ingest): better error handling in glue by @hsheth2 in #6030
- fix(gms): Write back lineage search results to in-memory cache bound to feature flag by @RyanHolstien in #6006
- fix(docs-site): style custom announcement bar by @jeffmerrick in #6022
- fix(ownership-differ): standardise ownership change event payload by @ngamanda in #5967
- feat(ingestion): Add fail-safe stale entity removal via configurable 'fail_safe_threshold' param. by @rslanka in #6027
- refactor(ingest): move common host_port validation by @hsheth2 in #6009
- feat(ingest): aws - support extra args to role config by @hsheth2 in #6031
- fix(frontend): refactoring AuthServiceClient by @aditya-radhakrishnan in #6029
- refactor(gms): Improving JWT parsing logic by @jjoyce0510 in #6025
- fix(ingest): support snowflake-beta extra compatibility by @hsheth2 in #6032
- fix(platform): Add aws-secretsmanager-jdbc driver in dependencies by @atul-chegg in #5968
- feat(patch): initial support of json patch style semantics in MCPs by @RyanHolstien in #5901
- fix(ingest): bigquery-beta - Project name quote and graceful lineage/usage failures by @treff7es in #6035
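For readers unfamiliar with the "json patch style semantics" mentioned in #5901, here is a minimal, generic sketch of how patch operations modify a document in place of a full overwrite. It assumes RFC 6902-like "add" and "remove" operations on nested objects only. The helper apply_patch and the customProperties example are hypothetical and are not DataHub's implementation or API.

```python
import copy
from typing import Any, Dict, List


def apply_patch(document: Dict[str, Any], ops: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply 'add' and 'remove' ops to nested dicts; arrays are not handled."""
    result = copy.deepcopy(document)  # leave the caller's document untouched
    for op in ops:
        # JSON Pointer: split on "/" and unescape "~1" -> "/", then "~0" -> "~".
        keys = [
            key.replace("~1", "/").replace("~0", "~")
            for key in op["path"].split("/")[1:]
        ]
        parent = result
        for key in keys[:-1]:
            parent = parent[key]  # KeyError if an intermediate object is missing
        if op["op"] == "add":
            parent[keys[-1]] = op["value"]
        elif op["op"] == "remove":
            del parent[keys[-1]]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return result


# Hypothetical aspect payload, for illustration only.
aspect = {"customProperties": {"owner_team": "data-platform"}}
patched = apply_patch(
    aspect,
    [
        {"op": "add", "path": "/customProperties/tier", "value": "gold"},
        {"op": "remove", "path": "/customProperties/owner_team"},
    ],
)
```

The point of patch semantics is that a writer only states the fields it wants to change, so concurrent writers are less likely to clobber each other's unrelated fields.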
New Contributors
- @pghazanfari made their first contribution in #5831
- @codesorcery made their first contribution in #5816
- @hieunt-itfoss made their first contribution in #5725
- @firasomrane made their first contribution in #5683
- @GyuhoonK made their first contribution in #5920
- @BogdanAntoniu78 made their first contribution in #5887
- @ltxlouis made their first contribution in #6019
- @atul-chegg made their first contribution in #5968
Full Changelog: v0.8.44...v0.8.45