This project was conducted as part of Udacity's Data Scientist Nanodegree. Students were asked to build:
- An ETL pipeline that loads CSV files, transforms and cleans the data, and saves it to a SQL database.
- A machine learning pipeline that trains and saves a model.
- A web app that ties it all together, displays some visualisations of the data, and predicts categories for new messages.
The complete project can be seen at www.brønstad.com/disaster
- python 3.6
- Flask
- numpy
- pandas
- sklearn
- nltk
- sqlalchemy
- re
- seaborn
- pickle
Run the following commands in the root directory.
- Run the ETL pipeline to load data from the CSV files, clean it, and save it to the database specified as the third argument:
- python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
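The cleaning step in `process_data.py` is not shown here, but its general shape can be sketched as follows. This is a minimal, self-contained illustration: the sample rows, category names (`water`, `food`), and the `messages` table name are assumptions standing in for the real CSV contents, and an in-memory SQLite database replaces `DisasterResponse.db`.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical sample rows standing in for the two CSV files.
messages = pd.DataFrame({"id": [1, 2],
                         "message": ["we need water", "storm coming"]})
categories = pd.DataFrame({"id": [1, 2],
                           "categories": ["water-1;food-0", "water-0;food-0"]})

def clean(messages, categories):
    # Merge the two sources on the shared id column.
    df = messages.merge(categories, on="id")
    # Split the semicolon-separated category string into one column per category.
    cats = df["categories"].str.split(";", expand=True)
    cats.columns = [c.split("-")[0] for c in cats.iloc[0]]
    # Keep only the trailing 0/1 flag as an integer.
    for col in cats.columns:
        cats[col] = cats[col].str[-1].astype(int)
    df = pd.concat([df.drop(columns="categories"), cats], axis=1)
    return df.drop_duplicates()

df = clean(messages, categories)

# Write the cleaned table to a SQL database (in-memory here for illustration).
engine = create_engine("sqlite:///:memory:")
df.to_sql("messages", engine, index=False, if_exists="replace")
```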
- Run the ML pipeline to train and save a model:
- python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
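A typical shape for such a training script, sketched with scikit-learn: vectorise the message text, fit a multi-output classifier, and pickle the result. The tiny training set, the two label columns, and the hyperparameters are illustrative assumptions; `train_classifier.py` reads its data from `DisasterResponse.db` and may use a different estimator.

```python
import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Tiny hypothetical training set; the real script loads it from the database.
X = ["we need water and food", "heavy storm approaching",
     "send water please", "wind damage everywhere"]
y = [[1, 1], [0, 0], [1, 0], [0, 0]]  # assumed label columns: water, food

# Text features plus one classifier per output category.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(
        RandomForestClassifier(n_estimators=10, random_state=0))),
])
pipeline.fit(X, y)

# Persist the trained model with pickle, as the command above does.
with open("classifier.pkl", "wb") as f:
    pickle.dump(pipeline, f)

preds = pipeline.predict(["we need water"])
```

The web app can then unpickle `classifier.pkl` and call `predict` on new messages.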
- Run run.py, then open http://0.0.0.0:3001/ in a browser.
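The classification side of the web app can be sketched as a small Flask route: the app receives a message as a query parameter and returns predicted categories. The route name `/go`, the stub `classify` function, and the single `water` category are assumptions for illustration; in the real `run.py` the pickled model and database are loaded at startup.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Stub classifier so the sketch is self-contained; the real app would call
# the pickled scikit-learn pipeline's predict() here.
def classify(message):
    return {"water": int("water" in message)}

@app.route("/go")
def go():
    # Read the user's message from the query string and return predictions.
    query = request.args.get("query", "")
    return jsonify(classification=classify(query))
```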
The project contains a Jupyter notebook for the ETL pipeline as well as one for the ML pipeline, plus a folder with the data, a folder for the web app, and a folder for the model.
Thanks to Figure Eight for the dataset.