Predictive maintenance is crucial for businesses that rely on machinery. It helps prevent unanticipated breakdowns by monitoring machine status and predicting failures before they occur. This project focuses on analyzing a manufacturing process where machine failures can happen due to various reasons. The goal is to identify contributing factors and predict failures accurately.
This project aims to answer the following key questions through data analytics:
- What factors contribute to machine failure?
- How can we predict when a machine is likely to fail?
- Which failure mode is the most likely to occur?
The dataset used in this project is provided by Stephan Matzka, School of Engineering - Technology and Life, Hochschule für Technik und Wirtschaft Berlin. It is publicly available as the AI4I 2020 Predictive Maintenance Dataset.
The dataset consists of 10,000 records with 14 features, including:
- UID: Unique identifier (1 to 10,000)
- Product ID: Quality variant (L = Low, M = Medium, H = High) with a serial number
- Air Temperature [K]: Generated by a random walk and normalized to a standard deviation of 2 K around 300 K
- Process Temperature [K]: Generated by a random walk, normalized to a standard deviation of 1 K, and added to the air temperature plus 10 K
- Rotational Speed [rpm]: Calculated from a power of 2860 W and overlaid with normally distributed noise
- Torque [Nm]: Normally distributed around 40 Nm (std dev = 10 Nm), with no negative values
- Tool Wear [min]: Each process adds wear to the tool according to product quality (H/M/L = 5/3/2 minutes per process)
- Machine Failure Label: Binary indicator (1 = failure, 0 = normal operation)
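As a small illustration of the Product ID and Tool Wear features above, the sketch below splits a Product ID into its quality letter and serial number and accumulates tool wear over a sequence of processes. The function names and the example IDs are hypothetical, and the ID format (one letter followed by digits) is an assumption based on the feature description.

```python
# Minutes of tool wear added per process, by quality variant (from the feature list).
WEAR_PER_PROCESS_MIN = {"H": 5, "M": 3, "L": 2}


def parse_product_id(product_id):
    """Split an ID such as 'M14860' into (quality letter, serial number).

    Assumes the format is one of L/M/H followed by digits only.
    """
    quality, serial = product_id[:1], product_id[1:]
    if quality not in WEAR_PER_PROCESS_MIN or not serial.isdigit():
        raise ValueError(f"unexpected Product ID format: {product_id!r}")
    return quality, int(serial)


def tool_wear_after(product_ids, start_min=0):
    """Total tool wear in minutes after processing the given products in order."""
    wear = start_min
    for product_id in product_ids:
        quality, _ = parse_product_id(product_id)
        wear += WEAR_PER_PROCESS_MIN[quality]
    return wear


if __name__ == "__main__":
    print(parse_product_id("M14860"))
    print(tool_wear_after(["L47181", "M14860", "H29424"]))
```

Keeping the quality letter as its own column is usually more useful for modeling than the raw ID, since the serial number carries no process information.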
Machine failure consists of five independent failure types:
- Tool Wear Failure (TWF): Occurs between 200-240 mins of tool usage (120 instances in dataset)
- Heat Dissipation Failure (HDF): Happens when the air-to-process temperature difference is below 8.6 K and rotational speed is below 1380 rpm (115 instances)
- Power Failure (PWF): Occurs when the power required for the process (torque * rotational speed in rad/s) is below 3500 W or above 9000 W (95 instances)
- Overstrain Failure (OSF): Occurs when the product of tool wear and torque exceeds a limit that depends on product quality (98 instances)
- Random Failures (RNF): Each process has a 0.1% failure probability, occurring randomly (5 instances)
A machine failure is recorded if at least one of these conditions is met.
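The three rule-based failure modes (HDF, PWF, OSF) can be sketched as a simple check on a single record. This is an illustrative sketch, not the notebook's implementation: the function names are hypothetical, the overstrain limits (11,000 / 12,000 / 13,000 min·Nm for L / M / H) are taken from the public dataset description rather than from this README, and power is computed with rotational speed converted from rpm to rad/s. TWF and RNF are left out because they involve random assignment.

```python
import math

# Overstrain limits in min*Nm per quality variant. ASSUMPTION: values are taken
# from the public dataset description, not from this README.
OSF_LIMITS = {"L": 11_000, "M": 12_000, "H": 13_000}


def power_watts(torque_nm, speed_rpm):
    """Mechanical power: torque times angular speed (rpm converted to rad/s)."""
    return torque_nm * speed_rpm * 2 * math.pi / 60


def rule_based_failures(quality, air_k, process_k, speed_rpm, torque_nm, wear_min):
    """Return the set of rule-based failure modes triggered by one record."""
    modes = set()
    # Heat dissipation: small temperature gap combined with low rotational speed.
    if (process_k - air_k) < 8.6 and speed_rpm < 1380:
        modes.add("HDF")
    # Power: required power outside the 3500 W to 9000 W window.
    power = power_watts(torque_nm, speed_rpm)
    if power < 3500 or power > 9000:
        modes.add("PWF")
    # Overstrain: tool wear times torque above the quality-specific limit.
    if wear_min * torque_nm > OSF_LIMITS[quality]:
        modes.add("OSF")
    return modes


if __name__ == "__main__":
    print(rule_based_failures("L", 300.0, 308.0, 1300, 50.0, 10))
```

A record counts as a machine failure when this set is non-empty or when one of the random modes (TWF, RNF) applies.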
The following steps outline the project workflow:
- Data Collection: Obtain and organize raw dataset
- Data Exploration & Processing: Clean, visualize, and understand dataset characteristics
- Dimensionality Reduction & Feature Selection: Select important features for modeling
- Exploratory Data Analysis (EDA): Identify trends and correlations in data
- Model Selection & Training: Compare multiple machine learning models for failure prediction
- Model Performance Evaluation: Assess accuracy, precision, recall, and F1-score
- Performance Visualization: Plot results to interpret model effectiveness
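Because failures make up only a small share of the 10,000 records, accuracy alone can look good for a model that never predicts a failure. The sketch below computes the four metrics named in the evaluation step from binary labels using only the standard library. The function name is hypothetical and the labels in the example are made up for illustration; they are not results from this project.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = failure)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Undefined ratios (no predicted or no actual positives) are reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up labels: 4 failures in 100 records, model always predicts "no failure".
    y_true = [0] * 96 + [1] * 4
    y_pred = [0] * 100
    print(classification_metrics(y_true, y_pred))
```

In the made-up example the accuracy is high while recall is zero, which is why precision, recall, and F1-score are reported alongside accuracy.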
This project aims to leverage machine learning to enhance predictive maintenance, reducing unexpected breakdowns and improving machine efficiency. By analyzing historical data, we can gain insights into failure patterns and take proactive measures to prevent downtime.
This project is licensed under the MIT License. See the LICENSE file for details.
The dataset used in this project is publicly available under its respective license at the UCI Machine Learning Repository. Please refer to the dataset source for licensing details.
- tejashwini-vemavarapu
For any inquiries or contributions, feel free to reach out.
- Clone the repository:
git clone https://github.com/your-username/Predictive-Machinery-Analysis-ML.git
cd Predictive-Machinery-Analysis-ML
- Install dependencies:
pip install -r requirements.txt
- Run Jupyter Notebook:
jupyter notebook
- Open and explore the notebooks for data analysis and model building.