Scripts to extract, transform, and load Los Angeles Yelp data from the Yelp Fusion API and Kaggle.

theodoremoreland/YelpETL

YelpETL

Scripts to extract, transform, and load Los Angeles Yelp data from the Yelp Fusion API and Kaggle. The process runs across three Jupyter Notebooks and ends with the data persisted in a MySQL / SQLite database.

This was a group project at Washington University's Data Analytics Boot Camp (2019).

Team

  • Heather Leek
  • Theodore Moreland
  • Adam Feldstein

Data Sources / Extract
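
For orientation, a minimal sketch of how a restaurant search request to the Yelp Fusion API could be assembled. This is not the project's actual extraction code. The helper name `build_search_request`, the search term, the page size, and the placeholder API key are all assumptions made for illustration. The request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Yelp Fusion business search endpoint.
YELP_SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(api_key, location="Los Angeles, CA", limit=50, offset=0):
    """Build (but do not send) a paged restaurant search request.

    Hypothetical helper: the parameter values are illustrative only.
    """
    query = urlencode(
        {
            "location": location,
            "term": "restaurants",
            "limit": limit,    # results per page
            "offset": offset,  # index of the first result on this page
        }
    )
    # Yelp Fusion authenticates with a bearer token in the Authorization header.
    headers = {"Authorization": f"Bearer {api_key}"}
    return Request(f"{YELP_SEARCH_URL}?{query}", headers=headers)


# "YOUR_API_KEY" is a placeholder, not a real credential.
request = build_search_request("YOUR_API_KEY")
```

Passing the result to `urllib.request.urlopen` would perform the call; increasing `offset` by `limit` on each iteration pages through the results.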

Transformations

General data cleaning such as standardizing postal codes.
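
As an illustration of what postal code standardization can look like, here is a minimal sketch. It is not the notebooks' actual code. The function name `standardize_zip` and the specific rules (drop the ZIP+4 suffix, undo float formatting, restore leading zeros) are assumptions.

```python
import re


def standardize_zip(raw):
    """Return a 5-digit ZIP code string, or None if one cannot be recovered.

    Hypothetical helper: the cleaning rules below are assumptions.
    """
    if raw is None:
        return None
    text = str(raw).strip()
    # Numeric columns often turn 90012 into the float 90012.0.
    text = re.sub(r"\.0$", "", text)
    # Accept 3 to 5 digits, optionally followed by a ZIP+4 suffix.
    match = re.match(r"^(\d{3,5})(?:-\d{4})?$", text)
    if match is None:
        return None
    # Restore leading zeros lost when the value was stored as a number.
    return match.group(1).zfill(5)
```

With pandas, a function like this could be applied to a column through `Series.map`, for example `df["zip"].map(standardize_zip)`, where `df` and `"zip"` are placeholder names.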

SQL Tables / Load

Restaurant Inspection Data Table

Restaurant Name
Restaurant Address
Restaurant City
Restaurant State
Restaurant Zip
Health Inspection Score
Health Inspection Grade

Yelp Restaurant Data Table

Business ID
Business Name
Business Address
Business City
Business State
Business Zip

Yelp Review Data Table

Business ID
Star Rating
Review Data
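
The three tables above could be created in SQLite roughly as follows. This is a sketch under stated assumptions, not the project's actual schema: the snake_case table and column names, the column types, the primary key, and the link from reviews to restaurants through the business ID are all illustrative guesses based on the field lists.

```python
import sqlite3

# Assumed schema: names, types, and keys are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS restaurant_inspection (
    restaurant_name    TEXT,
    restaurant_address TEXT,
    restaurant_city    TEXT,
    restaurant_state   TEXT,
    restaurant_zip     TEXT,
    inspection_score   INTEGER,
    inspection_grade   TEXT
);

CREATE TABLE IF NOT EXISTS yelp_restaurant (
    business_id      TEXT PRIMARY KEY,
    business_name    TEXT,
    business_address TEXT,
    business_city    TEXT,
    business_state   TEXT,
    business_zip     TEXT
);

CREATE TABLE IF NOT EXISTS yelp_review (
    business_id TEXT REFERENCES yelp_restaurant (business_id),
    star_rating REAL,
    review_data TEXT
);
"""


def create_tables(connection):
    """Create the three tables if they do not already exist."""
    connection.executescript(SCHEMA)


def list_tables(connection):
    """Return the names of all tables in the database, sorted by name."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


# An in-memory database keeps the sketch free of side effects.
connection = sqlite3.connect(":memory:")
create_tables(connection)
tables = list_tables(connection)
```

ZIP codes are stored as text here so that leading zeros and consistent five-character formatting are preserved.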

Screenshots

Python terminal processing restaurants from Yelp Fusion API calls

Python terminal processing reviews from Yelp Fusion API calls

Jupyter Notebook finished processing restaurants

Pandas restaurants dataframe

Jupyter Notebook finished processing reviews

Pandas reviews dataframe
