"Brazilian E-commerce ETL data pipeline" project focuses on building an ETL process to handle and analyze e-commerce data from Brazil. The project involves extracting data from MySQL, processing and transforming it through various stages, and visualizing it using Power BI.
- Data Extraction: Extract data from MySQL and store it as assets in the bronze layer.
- Data Transformation: Transform data using Pandas.
- Data Loading: Load the transformed data into PostgreSQL (a minimal code sketch of these three stages follows this list).
- Data Visualization: Connect Power BI with PostgreSQL and create simple charts to visualize data.
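These stages map naturally onto Dagster assets. The sketch below is a minimal illustration under assumed names: the connection URLs, the olist_orders_dataset table, and its columns are placeholders rather than the project's actual configuration, and the real pipeline persists the bronze layer to MinIO through an IO manager instead of Dagster's default local storage.

```python
# Minimal sketch of the three stages as Dagster assets.
# Connection URLs, table names, and column names are assumptions.
import pandas as pd
from dagster import asset
from sqlalchemy import create_engine

MYSQL_URL = "mysql+pymysql://user:password@mysql:3306/ecommerce"               # assumed
POSTGRES_URL = "postgresql+psycopg2://user:password@postgres:5432/warehouse"   # assumed


@asset
def bronze_orders() -> pd.DataFrame:
    """Extract the raw orders table from MySQL (bronze layer)."""
    engine = create_engine(MYSQL_URL)
    return pd.read_sql("SELECT * FROM olist_orders_dataset", engine)


@asset
def silver_orders(bronze_orders: pd.DataFrame) -> pd.DataFrame:
    """Transform with Pandas: parse timestamps and keep delivered orders."""
    df = bronze_orders.copy()
    df["order_purchase_timestamp"] = pd.to_datetime(df["order_purchase_timestamp"])
    return df[df["order_status"] == "delivered"]


@asset
def gold_orders(silver_orders: pd.DataFrame) -> None:
    """Load the transformed data into PostgreSQL for Power BI to query."""
    engine = create_engine(POSTGRES_URL)
    silver_orders.to_sql("gold_orders", engine, if_exists="replace", index=False)
```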
Create a virtual environment and a .env file in your project directory, then add your environment variables.
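As a rough sketch of how the pipeline code can pick up those variables at runtime (the key names below are examples, not the project's required keys, and python-dotenv is assumed to be installed):

```python
# Example of reading the .env file; the variable names are hypothetical.
import os

from dotenv import load_dotenv  # assumes python-dotenv is available

load_dotenv()  # loads key=value pairs from .env into the process environment

mysql_user = os.getenv("MYSQL_USER")
mysql_password = os.getenv("MYSQL_PASSWORD")
postgres_user = os.getenv("POSTGRES_USER")
postgres_password = os.getenv("POSTGRES_PASSWORD")
minio_access_key = os.getenv("MINIO_ACCESS_KEY")
minio_secret_key = os.getenv("MINIO_SECRET_KEY")
```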
Download the corresponding data into the ingest_data/data/ directory.
Start the MinIO, MySQL, and PostgreSQL containers. During this process, the raw data is also uploaded to MySQL.
```bash
docker-compose -f docker-compose-storage.yml build
docker-compose -f docker-compose-storage.yml up -d
```
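The upload of raw data into MySQL happens as part of this step. The sketch below shows one way it could be done with Pandas, assuming the CSV files sit in ingest_data/data/, a pymysql connection URL, and table names derived from the file names; the project's actual ingestion script may differ.

```python
# Rough sketch of loading the raw CSV files into MySQL; the directory,
# connection URL, and table-naming convention are assumptions.
from pathlib import Path

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://user:password@localhost:3306/ecommerce")  # assumed

for csv_path in Path("ingest_data/data").glob("*.csv"):
    table_name = csv_path.stem  # e.g. olist_orders_dataset.csv -> olist_orders_dataset
    df = pd.read_csv(csv_path)
    df.to_sql(table_name, engine, if_exists="replace", index=False, chunksize=10_000)
    print(f"Loaded {len(df)} rows into {table_name}")
```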
Start the Dagster container and materialize all assets through the Dagster UI.
```bash
docker-compose -f docker-compose-dagster.yml build
docker-compose -f docker-compose-dagster.yml up -d
```
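For the Dagster UI to discover the assets, the code location needs a Definitions object; a minimal sketch, assuming a hypothetical pipeline.assets module that holds the @asset functions:

```python
# Minimal Definitions sketch for the Dagster code location;
# the module layout (pipeline.assets) is an assumption about this repo.
from dagster import Definitions, load_assets_from_modules

from pipeline import assets  # hypothetical module containing the @asset functions

defs = Definitions(assets=load_assets_from_modules([assets]))
```

Once the container is up, the assets should appear in the Dagster UI, where they can be materialized (for example via Materialize all).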
Open Power BI Desktop, connect to the PostgreSQL database, and create a dashboard.
Data Processing: Python
Database and Data Storage: MySQL, PostgreSQL, MinIO
Orchestration: Dagster
Visualization: Power BI
Containerization: Docker