Detecting Machine Failure from IoT Sensors with a SQL Pipeline.

hypergalois/MachineFailurePrediction

Machine Failure Detection with PCA and Classification 🚀

1. Exploratory Data Analysis (EDA) and Data Preprocessing

SQLITE

  • Handling missing values: Imputation techniques were used to replace missing values in key features.
  • Encoding categorical variables: One-Hot Encoding was applied to categorical variables.
  • Normalization and scaling: PCA (Principal Component Analysis) was used for dimensionality reduction. Because PCA is sensitive to feature scale, features are normally standardized before it is applied.
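
The imputation and encoding steps above can be sketched directly in SQL. The example below is a minimal illustration using Python's built-in sqlite3 module, not the repository's actual pipeline: the table name `sensor_readings`, its columns, and the sample values are all hypothetical. It fills missing temperatures with the column mean and one-hot encodes a categorical column.

```python
import sqlite3

# Hypothetical schema and sample values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sensor_readings ("
    "id INTEGER PRIMARY KEY, temperature REAL, machine_type TEXT, failure INTEGER)"
)
conn.executemany(
    "INSERT INTO sensor_readings (temperature, machine_type, failure) VALUES (?, ?, ?)",
    [(70.0, "L", 0), (None, "M", 0), (90.0, "H", 1), (80.0, "L", 0)],
)

# Mean imputation with COALESCE (AVG ignores NULLs) and
# one-hot encoding with one CASE expression per category.
query = """
SELECT
    id,
    COALESCE(temperature, (SELECT AVG(temperature) FROM sensor_readings)) AS temperature,
    CASE WHEN machine_type = 'L' THEN 1 ELSE 0 END AS type_l,
    CASE WHEN machine_type = 'M' THEN 1 ELSE 0 END AS type_m,
    CASE WHEN machine_type = 'H' THEN 1 ELSE 0 END AS type_h,
    failure
FROM sensor_readings
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Doing these steps in the query keeps the cleaned table reproducible from the raw one. A real pipeline would compute the imputation mean on the training split only, to avoid leaking test data.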

Correlation

SMOTE
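
SMOTE (Synthetic Minority Over-sampling Technique) balances classes by creating synthetic minority samples on the line segments between a minority point and its nearest minority neighbours. The dependency-free sketch below shows only that core idea. The function name `smote_sketch`, the sample points, and the parameter defaults are assumptions for illustration; a project like this one would more typically use a library implementation such as the one in imbalanced-learn.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class (failure) samples with two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
print(new_points)
```

Oversampling should be applied to the training split only, so that the evaluation data keeps the original class balance.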

2. Model Selection Using Multiple Classifiers (Logistic Regression, Random Forest, Gradient Boosting, XGBoost)

  • Logistic Regression: A linear baseline for classifying machine status.
  • Random Forest Classifier: An ensemble of decision trees that can capture complex, non-linear feature relationships.
  • Gradient Boosting Classifier: A boosting model that builds trees sequentially, each one correcting the errors of the previous ones.
  • XGBoost Classifier: A gradient boosting implementation optimized for speed and efficiency.

PCA was used to reduce feature dimensionality, with the aim of improving model performance.
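
To make the explained-variance idea concrete: the share of variance captured by the first principal component is the largest eigenvalue of the covariance matrix divided by the matrix trace (the total variance). The sketch below computes that ratio with power iteration in plain Python. The helper names and the two-feature sample data are hypothetical, and standardization is skipped for brevity; a library routine such as scikit-learn's PCA would normally be used instead.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length feature tuples."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric matrix via power iteration."""
    d = len(matrix)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector.
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(a * b for a, b in zip(v, mv))


# Hypothetical readings from two strongly correlated sensors.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
cov = covariance_matrix(data)
total_variance = sum(cov[i][i] for i in range(len(cov)))
ratio = top_eigenvalue(cov) / total_variance
print(f"Explained variance ratio of the first component: {ratio:.4f}")
```

When two features move together like this, a single component retains almost all of the variance, which is the situation in which reducing dimensionality loses little information.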

Explained Variance

Model screenshot

3. Metrics Calculation for Each Model

  • Precision: The proportion of predicted positive cases that are actually positive, TP / (TP + FP).
  • Recall: The proportion of actual positive cases that are correctly identified, TP / (TP + FN).
  • F1 Score: The harmonic mean of precision and recall.
  • Accuracy: The proportion of all predictions that are correct.
  • Confusion Matrix: A table of true positives, false positives, true negatives, and false negatives.
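
The metrics listed above all follow from the four confusion-matrix counts. The sketch below computes them in plain Python for binary labels, where 1 is assumed to mean "failure". The function names and the sample labels are made up for illustration and are not results from this project.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute precision, recall, F1, and accuracy from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Made-up labels: 1 = failure, 0 = normal operation.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
print(counts)
print(metrics)
```

For failure detection on imbalanced data, recall and F1 are usually more informative than accuracy, since a model that always predicts "no failure" can still score high accuracy.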

Metrics screenshot
