Stroke Prediction

Introduction

Stroke is a leading cause of death and disability worldwide, affecting millions of people annually. This project aims to develop predictive models to identify individuals at high risk of stroke based on various health metrics and demographic factors. By leveraging advanced machine learning techniques, the goal is to enhance early detection and preventive healthcare strategies, ultimately reducing the incidence and impact of strokes.

Models Used

Logistic Regression

A fundamental model used for its simplicity and interpretability, providing baseline performance metrics.

K-Nearest Neighbors (K-NN)

This model was used to capture local patterns and clusters within the data, since it classifies each record from the most similar records in the training set.
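To illustrate the idea behind K-NN, here is a minimal pure-Python sketch of majority voting among the nearest neighbours. It is not the project's code: the function name `knn_predict`, the two features (age and average glucose level), and the toy values are all hypothetical and made up for the example.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point, paired with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up toy data (hypothetical features): (age, average glucose level).
# Labels: 0 = no stroke, 1 = stroke.
train_points = [(25, 85), (30, 90), (35, 95), (70, 210), (75, 220), (80, 230)]
train_labels = [0, 0, 0, 1, 1, 1]

low_risk = knn_predict(train_points, train_labels, (28, 88), k=3)
high_risk = knn_predict(train_points, train_labels, (72, 215), k=3)
print(low_risk, high_risk)
```

In practice features on different scales should be standardized before computing distances, otherwise the feature with the largest range dominates the vote.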

Random Forest

Chosen for its ability to handle complex feature interactions and its resistance to overfitting, Random Forest provided the best performance in our evaluation.

We use GridSearchCV to tune each model: it evaluates every combination in a given parameter grid with cross-validation and selects the best-scoring hyperparameters.
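As a rough sketch of how such a search might look with scikit-learn, the example below tunes a Random Forest on synthetic stand-in data. The parameter grid, the 80/20 split, the 5-fold setting, and the use of `make_classification` instead of the stroke data are assumptions for illustration, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (NOT the stroke dataset), so the sketch runs on its own.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Hypothetical grid: every combination (2 x 2 = 4) is evaluated.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation on the training split
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_          # best combination found in the grid
cv_score = search.best_score_              # mean cross-validated accuracy of that combination
test_score = search.score(X_test, y_test)  # accuracy of the refitted best model on held-out data
print(best_params, round(cv_score, 3), round(test_score, 3))
```

The same pattern applies to Logistic Regression and K-NN by swapping the estimator and its grid. Keeping the test split out of the search means the test score is measured on data the tuning never saw.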

Dataset

The dataset used for this project is sourced from Kaggle and can be accessed at Kaggle | Health dataset. It is mostly preprocessed, making it suitable for immediate use in modeling and analysis. It includes various health metrics and demographic factors essential for predicting stroke. Datasets for diabetes and hypertension are also available at the same link.




Random Forest achieved the highest accuracy of the three models, with a cross-validation score of 0.997 and a test-set score of 0.999.


KundhanMiriyala/Stoke-prediction


