03 Metadata‐driven Orchestration
The Extract Load Transform (ELT) framework is a metadata-driven orchestration tool designed for modern cloud data platforms. It simplifies ingestion and transformation pipelines, ensuring a consistent development experience and ease of maintenance. The framework supports batch ingestion and is extensively tested with Microsoft Fabric and Azure managed services like Azure Databricks and Azure Synapse. It uses an ANSI-compatible control database as the metadata repository.
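To make the idea of a metadata repository concrete, the sketch below uses an in-memory SQLite database as a stand-in for the control database. The table `ingest_definition`, its columns, and the function names are hypothetical and are not the framework's actual schema. The point is only that the orchestrator decides what to run by querying metadata rather than by hard-coding pipelines.

```python
import sqlite3


def build_control_db() -> sqlite3.Connection:
    """Create a hypothetical control table and seed it with three definitions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ingest_definition ("
        " id INTEGER PRIMARY KEY,"
        " source_type TEXT NOT NULL,"
        " entity_name TEXT NOT NULL,"
        " enabled INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO ingest_definition (source_type, entity_name, enabled)"
        " VALUES (?, ?, ?)",
        [
            ("database", "customers", 1),
            ("rest_api", "orders", 1),
            ("flat_file", "legacy_feed", 0),  # disabled in metadata, so skipped
        ],
    )
    return conn


def plan_runs(conn: sqlite3.Connection) -> list[str]:
    """Return one planned run per enabled definition, in insertion order."""
    rows = conn.execute(
        "SELECT source_type, entity_name FROM ingest_definition"
        " WHERE enabled = 1 ORDER BY id"
    ).fetchall()
    return [f"ingest {entity} via {source}" for source, entity in rows]


plan = plan_runs(build_control_db())
print(plan)
```

Adding or disabling a pipeline in this model is a row change in the control table, not a code change.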
- Configurable and Extendable: Easily adapt the framework to meet specific needs.
- Data Source Agnostic: Ingest data from various sources such as databases, Delta Lake, REST APIs, flat files, JSON, and XML, without storing connection strings as metadata.
- Delta and Full Loads: Support for both incremental and full data loads.
- Re-run and Retry Capability: Automatically handle failures without manual intervention.
- In-built Audit Tracking: Track data processing activities with built-in audit capabilities.
- Extended Audit Capability: Enhance audit tracking with Azure PaaS services like Diagnostic Logging.
- Eliminates Manual Data Patching: Streamline data processing by removing the need for manual interventions.
- Data Lineage Support: Maintain data lineage throughout the data lifecycle.
- Level 1 and Level 2 Transformations: Support for one-to-many and many-to-many transformations.
- On-demand Pipeline and Transformation Management: Enable or disable pipelines and transformations as needed.
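As an illustration of the delta/full load and retry features listed above, here is a minimal generic sketch. It is not the framework's implementation: the watermark comparison, the `modified` field, and the names `select_rows` and `run_with_retry` are assumptions chosen for the example.

```python
from datetime import datetime
from typing import Callable, Optional


def select_rows(rows: list[dict], load_type: str,
                watermark: Optional[datetime]) -> list[dict]:
    """Full load returns every row; delta returns rows changed after the watermark."""
    if load_type == "full" or watermark is None:
        return list(rows)
    return [row for row in rows if row["modified"] > watermark]


def run_with_retry(task: Callable[[], int], max_attempts: int = 3) -> int:
    """Call task until it succeeds or max_attempts is exhausted."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return task()
        except RuntimeError as exc:  # treated as a transient failure here
            last_error = exc
    raise RuntimeError(f"failed after {max_attempts} attempts") from last_error


source = [
    {"id": 1, "modified": datetime(2024, 1, 1)},
    {"id": 2, "modified": datetime(2024, 1, 5)},
    {"id": 3, "modified": datetime(2024, 1, 9)},
]
watermark = datetime(2024, 1, 4)  # hypothetical high-water mark from the last run

delta = select_rows(source, "delta", watermark)
full = select_rows(source, "full", watermark)

attempts = {"count": 0}


def flaky_load() -> int:
    """Fail once, then report how many delta rows were loaded."""
    attempts["count"] += 1
    if attempts["count"] < 2:
        raise RuntimeError("transient failure")
    return len(delta)


loaded = run_with_retry(flaky_load)
```

In a metadata-driven design, the load type and the last watermark would typically be read from the control database rather than passed in by hand.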
- Clone or Fork the Repository: Start by cloning or forking the repository from github.com/bennyaustin/elt-framework.
- Run the ControlDB Deployment: The GitHub Actions workflow bennyaustin/elt-framework/blob/main/.github/workflows/ControlDB-deployment.yml deploys the controlDB objects.
Pre-Requisite:
This GitHub Action requires the following repository secrets:
- CLIENT_ID: Client/Application ID of the Service Principal.
- CLIENT_SECRET: Service Principal Secret.
- CONTROLDB_CONNECTIONSTRING: controlDB connection string in service principal authentication format:
  Server=<SQL Server>;Authentication=Active Directory Service Principal;Encrypt=True;Database=controlDB;User Id=<Service Principal Client/Application ID>;Password=<Service Principal Secret>
- SUBSCRIPTION_ID: ID of the Azure subscription that hosts controlDB.
- TENANT_ID: Microsoft Entra tenant ID for controlDB.
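The connection string format above can be assembled programmatically before it is saved as a secret. The following is a small sketch under stated assumptions: the helper names are hypothetical, and the server name, client ID, and secret are placeholders, not real values.

```python
from typing import Mapping

# Secret names as listed in the prerequisites above.
REQUIRED_SECRETS = (
    "CLIENT_ID",
    "CLIENT_SECRET",
    "CONTROLDB_CONNECTIONSTRING",
    "SUBSCRIPTION_ID",
    "TENANT_ID",
)


def build_controldb_connection_string(server: str, client_id: str,
                                      client_secret: str,
                                      database: str = "controlDB") -> str:
    """Assemble the service principal connection string in the documented format."""
    return (
        f"Server={server};"
        "Authentication=Active Directory Service Principal;"
        "Encrypt=True;"
        f"Database={database};"
        f"User Id={client_id};"
        f"Password={client_secret}"
    )


def missing_secrets(env: Mapping[str, str]) -> list[str]:
    """Return the required secret names that are absent or empty in env."""
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


# Placeholder values for illustration only.
example = build_controldb_connection_string(
    server="example.database.windows.net",
    client_id="00000000-0000-0000-0000-000000000000",
    client_secret="placeholder-secret",
)
missing = missing_secrets({"CLIENT_ID": "x", "TENANT_ID": "y"})
```

A check like `missing_secrets` is a convenient pre-flight step in a local script. The workflow itself reads the values from the repository secrets.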
Select Run workflow on the ControlDB-deployment.yml GitHub Action to deploy the database objects.
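The same manual run can also be requested through the GitHub REST API's workflow dispatch endpoint. The sketch below only builds the request and does not send it. The owner `my-account` (a fork), the branch `main`, and the token value are assumptions for illustration, and the token would need permission to run Actions on the repository.

```python
import json
import urllib.request


def build_dispatch_request(owner: str, repo: str, workflow_file: str,
                           ref: str, token: str) -> urllib.request.Request:
    """Build a POST request that triggers a workflow_dispatch run on the given ref."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical fork owner and placeholder token.
request = build_dispatch_request(
    owner="my-account",
    repo="elt-framework",
    workflow_file="ControlDB-deployment.yml",
    ref="main",
    token="placeholder-token",
)
# To actually trigger the run: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before any call is made.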
For more details, visit the ELT Framework Wiki.