alexxx-db/dlt-hands-on-workshop

Introduction: Delta-Live-Tables-Hands-on-Workshop

Welcome to the repository for the Databricks 1:M Delta Live Tables Workshop!

This repository contains the notebooks used in the workshop. They demonstrate how to use Delta Live Tables (DLT) to build simple, scalable, production-ready pipelines with built-in data quality controls and monitoring, pipeline logging, data lineage tracking, automated pipeline orchestration, automatic error handling, and advanced autoscaling. The notebooks also cover change data capture (CDC) and advanced data engineering concepts (window functions and meta-programming).
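To make the idea of built-in data quality controls more concrete, here is a minimal plain-Python sketch of what a "drop rows that violate an expectation" rule does. It is not the DLT API and does not come from the workshop notebooks: in a real pipeline this role is played by DLT expectations (for example the `expect_or_drop` decorator), and the function name `apply_expectations`, the sample rows, and the rule names below are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def apply_expectations(
    rows: List[Row],
    expectations: Dict[str, Callable[[Row], bool]],
) -> Tuple[List[Row], Dict[str, int]]:
    """Keep rows that pass every expectation and count failures per rule.

    Hypothetical helper that mimics the 'drop invalid rows and report
    metrics' behaviour of a data quality rule; it is not part of DLT.
    """
    kept: List[Row] = []
    failures = {name: 0 for name in expectations}
    for row in rows:
        passed = True
        for name, predicate in expectations.items():
            if not predicate(row):
                failures[name] += 1  # record which rule was violated
                passed = False
        if passed:
            kept.append(row)
    return kept, failures


# Made-up sample data, for illustration only.
rows = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},   # violates valid_id
    {"id": 3, "amount": -2.0},     # violates non_negative_amount
]
expectations = {
    "valid_id": lambda r: r["id"] is not None,
    "non_negative_amount": lambda r: r["amount"] >= 0,
}

clean, failures = apply_expectations(rows, expectations)
print(clean)
print(failures)
```

The per-rule failure counts stand in for the quality metrics a pipeline would surface for monitoring; only the rows that satisfy every rule flow downstream.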

Reading Resources

See the links below for more documentation:

Workshop Flow

The workshop consists of 4 interactive sections, split across 4 notebooks located in the notebooks folder of this repository. The notebooks are run sequentially as we explore the capabilities of the lakehouse, from data ingestion and data curation to performance optimization.

| Notebook | Summary |
| --- | --- |
| 01-Structured Streaming with Databricks Delta Tables | Processing and ingesting data at scale using Databricks tunables and the medallion architecture |
| 02-Orchestrating with Delta Live Tables | Changing Spark properties, configuring table properties, optimizing tables, combining batch and incremental tables |
| 03. Implement CDC In DLT Pipeline: Change Data Capture (Python) | Implementing change data capture in DLT pipelines for access to fresh data |
| 04: Meta-programming | Examples of metaprogramming in DLT: when to use it, the problems it solves, and how to configure it |
| 05: ML Models in DLT Pipelines | Example of integrating ML models with DLT pipelines |
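As a rough illustration of the change data capture idea covered in notebook 03, the sketch below replays a stream of change events into a target keyed by primary key, keeping only the newest version of each row. It is plain Python rather than DLT code and is not taken from the notebooks: DLT provides its own APPLY CHANGES mechanism for this, and the function `apply_changes_sketch`, the column names (`id`, `op`, `ts`), and the sample events here are assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]


def apply_changes_sketch(
    events: List[Row],
    key: str,
    sequence_by: str,
    is_delete: Callable[[Row], bool],
) -> Dict[object, Row]:
    """Upsert change events into a target, honouring event order.

    Hypothetical helper: for each key, the event with the highest
    sequence value wins, and a winning delete event removes the row.
    """
    # key -> (sequence value, row or None); None marks a deleted key so
    # that a late, older event cannot resurrect it.
    latest: Dict[object, Tuple[object, Optional[Row]]] = {}
    for event in events:
        k, seq = event[key], event[sequence_by]
        if k in latest and latest[k][0] >= seq:
            continue  # stale or out-of-order event, ignore it
        latest[k] = (seq, None if is_delete(event) else event)
    return {k: row for k, (_, row) in latest.items() if row is not None}


# Made-up change feed, for illustration only.
events = [
    {"id": 1, "name": "Ann", "op": "INSERT", "ts": 1},
    {"id": 2, "name": "Bob", "op": "INSERT", "ts": 2},
    {"id": 1, "name": "Anna", "op": "UPDATE", "ts": 4},
    {"id": 1, "name": "An", "op": "UPDATE", "ts": 3},   # arrives late
    {"id": 2, "name": "Bob", "op": "DELETE", "ts": 5},
]

target = apply_changes_sketch(
    events, key="id", sequence_by="ts", is_delete=lambda e: e["op"] == "DELETE"
)
print(target)
```

Ordering by a sequence column rather than by arrival order is what lets the late update with `ts` 3 be ignored, so the target reflects the freshest state of each key.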

Setup / Requirements

This workshop requires a running Databricks workspace. If you are an existing Databricks customer, you can use your existing Databricks workspace. Otherwise, the notebooks in this workshop have been tested to run on Databricks Community Edition as well.

DBR Version

The features used in this workshop require Databricks Runtime (DBR) 9.1 LTS or later.

Repos

If you have Repos enabled on your Databricks workspace, you can import this repo directly and run the notebooks as is, skipping the DBC archive step.

DBC Archive

Download the DBC archive from releases and import the archive into your Databricks workspace.
