Machine Learning Project on Drought level Classification

Piyush1729/CMPE257-Project

 
 

Team 5

Raj Bharatbhai Choksi - rajchoksi1997

Piyush Gade - Piyush1729

Alekhya Pasupuleti - AlekhyaPasupuleti

Mohamed Shafeeq Usman - MdShafeeqU

Project Title: Comparative Study of Machine Learning Models in Drought Prediction

Problem Statement:

Use machine learning models to predict drought levels and compare their results using several performance metrics.

Abstract

Agriculture is an important part of the US economy. According to the US government, agriculture contributed $1.5 trillion to the economy in 2020, a 5% share. However, global warming and climate change lead to significant drought in various parts of the country, which adversely affects agriculture. Unlike other natural disasters, drought develops slowly and has long-term consequences. By leveraging machine learning, we can help farmers take preventive measures and minimize their losses.

Our aim is to provide a comparative study of the performance of different machine learning models in predicting five levels of drought, ranging from moderate to extreme, using meteorological data. Weather conditions and precipitation levels at different elevations above sea level are important indicators for predicting droughts. We aim to use supervised learning models such as Random Forest, Decision Tree, KNN, and Logistic Regression, and to compare their results using performance metrics such as F1 score, accuracy, recall, precision, and the ROC curve.
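To make the comparison metrics concrete, here is a minimal sketch of how accuracy and macro-averaged precision, recall, and F1 could be computed for a multi-class problem. The function name `macro_metrics`, the toy labels, and the two sets of predictions are hypothetical and are not taken from the project notebooks. Macro averaging is one possible choice, and the project may average differently.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical drought classes and predictions from two imaginary models.
y_true = [0, 0, 1, 1, 2, 2]
model_a_pred = [0, 0, 1, 1, 2, 2]
model_b_pred = [0, 1, 1, 1, 2, 0]

report_a = macro_metrics(y_true, model_a_pred)
report_b = macro_metrics(y_true, model_b_pred)
for name, report in [("model A", report_a), ("model B", report_b)]:
    print(name, {k: round(v, 3) for k, v in report.items()})
```

In a notebook, the same numbers would more commonly come from scikit-learn, for example `sklearn.metrics.f1_score` with `average="macro"`. The from-scratch version is shown only to spell out what each metric measures.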

We use a dataset from the US Drought Monitor, which provides drought data and meteorological statistics from the year 2000 onwards. The dataset is updated weekly. During preprocessing, we dropped the null values and removed special characters from the numerical columns, which left us with approximately three million rows. There are twenty features and one target variable, which gives a score for the severity of the drought. Of the twenty features, one is categorical and the rest are continuous. The only categorical feature is the date, so we transformed it into three numerical features: day, month, and year. We binned the target variable into six classes and plotted histograms for all the features. We also performed univariate and bivariate analysis. Lastly, we examined the correlations between the independent variables.
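The preprocessing steps described above can be sketched as follows, using only the Python standard library. This is an illustrative sketch, not the project's actual code: the column names (`PRECTOT`, `T2M`, `score`), the sample rows, the date format, and the binning rule (rounding the score to the nearest integer and clamping it to the classes 0 to 5) are all assumptions.

```python
import re
from datetime import datetime

# Hypothetical raw rows; column names and values are illustrative only.
raw_rows = [
    {"date": "2000-01-04", "PRECTOT": "1.25", "T2M": "14.7#", "score": "0.0"},
    {"date": "2000-01-11", "PRECTOT": "", "T2M": "15.1", "score": "1.4"},
    {"date": "2000-01-18", "PRECTOT": "0.02", "T2M": "$16.3", "score": "4.6"},
]

NUMERIC_COLUMNS = ["PRECTOT", "T2M", "score"]


def clean_number(text):
    """Remove special characters and parse a float; return None if nothing usable is left."""
    cleaned = re.sub(r"[^0-9.\-]", "", text)
    try:
        return float(cleaned)
    except ValueError:
        return None


def bin_score(score, n_classes=6):
    """Map a continuous score to an integer class in 0..n_classes-1 (assumed scheme)."""
    nearest = int(score + 0.5)  # round half up for non-negative scores
    return min(max(nearest, 0), n_classes - 1)


def preprocess(rows):
    processed = []
    for row in rows:
        values = {col: clean_number(row[col]) for col in NUMERIC_COLUMNS}
        if any(v is None for v in values.values()):
            continue  # drop rows with missing values
        # Split the date into three numerical features.
        date = datetime.strptime(row["date"], "%Y-%m-%d")
        values.update(year=date.year, month=date.month, day=date.day)
        # Replace the continuous score with a binned class label.
        values["drought_class"] = bin_score(values.pop("score"))
        processed.append(values)
    return processed


clean_rows = preprocess(raw_rows)
for row in clean_rows:
    print(row)
```

For a dataset of roughly three million rows, a vectorized library such as pandas would be the more practical tool. The loop above is only meant to show the order of the steps.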

Insights and plots after preprocessing of dataset are included in the DroughtPrediction.pdf file.

Dataset Link: https://drive.google.com/file/d/1pcIRbSgmF6jwd7yemi2dk7SbqjJePqI_/view?usp=sharing
