
02 Architecture

Benny Austin edited this page Dec 27, 2024 · 12 revisions

Architecture Diagram

The accelerator uses a medallion architecture. Each medallion layer can be configured as files, a Lakehouse, or a Data Warehouse in OneLake. In this setup, the bronze layer consists of files, the silver layer is a Lakehouse, and the gold layer is a Data Warehouse.

[Figure: Architecture Diagram]
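
The layer-to-storage mapping described above can be written down as a small lookup. This is only an illustrative sketch: `StorageType`, `MEDALLION_LAYERS`, and `storage_for` are hypothetical names, not part of the accelerator.

```python
from enum import Enum


class StorageType(Enum):
    """Storage options a medallion layer can use in OneLake."""
    FILES = "files"
    LAKEHOUSE = "lakehouse"
    WAREHOUSE = "warehouse"


# Mapping for the setup described on this page (hypothetical structure).
MEDALLION_LAYERS = {
    "bronze": StorageType.FILES,
    "silver": StorageType.LAKEHOUSE,
    "gold": StorageType.WAREHOUSE,
}


def storage_for(layer: str) -> StorageType:
    """Return the storage type configured for a layer, ignoring case."""
    key = layer.strip().lower()
    if key not in MEDALLION_LAYERS:
        raise ValueError(f"Unknown medallion layer: {layer!r}")
    return MEDALLION_LAYERS[key]


if __name__ == "__main__":
    for name in MEDALLION_LAYERS:
        print(name, "->", storage_for(name).value)
```

Changing a layer's storage type would then be a one-line change to the mapping rather than a change to the logic that reads it.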

Dataflow and Components

  1. Data Factory pipelines ingest data from both cloud and on-premises sources into the OneLake bronze layer. On-premises sources require an on-premises data gateway (OPDG).
  2. Data lands in the bronze layer in OneLake as files, as-is and without any transformation, in Parquet format where possible.
  3. Spark notebooks then transform the raw data from the bronze layer and store the curated data in the silver layer of OneLake as Lakehouse tables. Here, the data is cleansed, flattened, and standardized while maintaining its grain. Bronze data can map to Lakehouse tables one-to-one or one-to-many.
  4. Data Warehouse stored procedures apply business rules to data from the Lakehouse tables in the silver layer and land the results as DW tables in the gold layer of OneLake. Typical activities here include applying custom business rules, creating snapshots, merging data from multiple tables, and creating a hub-and-spoke star schema. Silver Lakehouse tables can map to gold DW tables one-to-one, one-to-many, or many-to-one.
  5. Semantic models built on the gold layer DW tables serve as the analytics layer, sometimes referred to as the diamond layer. The relationships between tables are established here.
  6. This Fabric accelerator is orchestrated by the ELT Framework, a metadata-driven orchestration tool that streamlines the ingestion and transformation pipelines.
  7. The ELT Framework stores its metadata in an Azure SQL serverless database, which is mirrored into the Fabric workspace. Direct Lake semantic models built on the mirrored ELT metadata offer real-time reporting.
  8. Power BI serves as the analytics layer, supported by Power BI Copilot for self-service capabilities.
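
To make the idea of metadata-driven orchestration in the steps above more concrete, here is a minimal sketch in plain Python. It does not use the ELT Framework's actual schema or any Fabric API: the `Step` fields, the sample rows, the object names (`sales.parquet`, `lh.sales`, `dw.FactSales`), and the handler functions are all assumptions made for illustration. The point is only that the flow (pipeline to bronze, notebook to silver, stored procedure to gold) is driven by rows of metadata rather than hard-coded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Step:
    """One hypothetical metadata row describing a unit of work."""
    order: int          # execution order within a run
    engine: str         # "pipeline", "notebook", or "stored_procedure"
    target_layer: str   # "bronze", "silver", or "gold"
    target: str         # file or table produced by the step


# Hypothetical metadata rows, deliberately listed out of order.
METADATA: List[Step] = [
    Step(3, "stored_procedure", "gold", "dw.FactSales"),
    Step(1, "pipeline", "bronze", "sales.parquet"),
    Step(2, "notebook", "silver", "lh.sales"),
]

# Stand-in handlers; a real run would trigger the corresponding engine.
HANDLERS: Dict[str, Callable[[Step], str]] = {
    "pipeline": lambda s: f"ingest {s.target} into {s.target_layer}",
    "notebook": lambda s: f"transform {s.target} into {s.target_layer}",
    "stored_procedure": lambda s: f"apply business rules for {s.target} in {s.target_layer}",
}


def run(steps: List[Step], handlers: Dict[str, Callable[[Step], str]]) -> List[str]:
    """Execute steps in metadata order, dispatching on the engine type."""
    log = []
    for step in sorted(steps, key=lambda s: s.order):
        log.append(handlers[step.engine](step))
    return log


if __name__ == "__main__":
    for line in run(METADATA, HANDLERS):
        print(line)
```

In this style, adding a new source or table means adding a metadata row; the dispatch logic itself stays unchanged.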