Home Credit Default Risk is a dataset competition hosted by Kaggle.
Use the package manager pip to install the dependencies.
pip install -r requirements.txt
There are three main notebooks:
data_aggregation.ipynb
eda.ipynb
model_selection.ipynb
Run them in the order listed to build the augmented dataset and then work through the complete model training process.
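The notebooks can also be run from a script instead of opening each one by hand. The following is a minimal sketch, assuming that jupyter and nbconvert are installed and that the notebooks sit in the current directory. The helper names nbconvert_command and run_notebooks are illustrative and are not part of this repository.

```python
import subprocess
from typing import Callable, List, Sequence

# Notebook order as listed in this README.
NOTEBOOKS = ["data_aggregation.ipynb", "eda.ipynb", "model_selection.ipynb"]


def nbconvert_command(notebook: str) -> List[str]:
    """Build the nbconvert call that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_notebooks(
    notebooks: Sequence[str] = NOTEBOOKS,
    runner: Callable = subprocess.run,
) -> None:
    """Execute each notebook in sequence (hypothetical helper)."""
    for notebook in notebooks:
        # check=True makes subprocess.run raise on the first failing notebook,
        # so later notebooks never run against a half-built dataset.
        runner(nbconvert_command(notebook), check=True)
```

Calling run_notebooks() with no arguments executes all three notebooks in the listed order. The runner parameter exists only so the sequence can be checked without launching Jupyter.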
The data files are quite large and are not included in the repository. Download them from the Kaggle competition page and place them in the data folder. The notebooks use the following files:
application_test.csv
application_train.csv
bureau.csv
bureau_balance.csv
credit_card_balance.csv
installments_payments.csv
previous_application.csv
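Before running the notebooks, it can help to confirm that the data files are in place. The following is a minimal sketch, assuming the data folder is named data and sits next to the notebooks. The function missing_files is a hypothetical helper, and the example list is deliberately partial, so extend it with the remaining file names listed above.

```python
from pathlib import Path
from typing import Iterable, List


def missing_files(data_dir: str, expected: Iterable[str]) -> List[str]:
    """Return the expected file names that are not present in data_dir."""
    folder = Path(data_dir)
    return [name for name in expected if not (folder / name).is_file()]


# Partial example list; add the other CSV files the notebooks need.
expected = ["application_train.csv", "application_test.csv", "bureau.csv"]

# "data" is an assumed folder name; adjust it to your local layout.
missing = missing_files("data", expected)
if missing:
    print("Missing from data folder:", ", ".join(missing))
```

The check only looks at file names; it does not validate the contents or size of each CSV.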
A final report detailing the entire workflow for this project can be found in the repository.