This repository contains all the projects I created as part of the Udacity Data Engineering Nanodegree.
The following topics were covered during the course:
- Data modeling with PostgreSQL and Apache Cassandra
- Fundamentals of relational (SQL) and non-relational (NoSQL) databases
- Cloud data warehousing
- Data lake fundamentals
- ETL pipeline automation
The following technologies were used:
- Python
- PostgreSQL
- Apache Cassandra
- AWS (S3, Redshift, EMR, EC2)
- Apache Spark
- Apache Airflow