Diamond Price Prediction using Random Search, scikit-learn Pipelines, and Random Forest/XGBoost with Standard Scaler
This Jupyter notebook demonstrates a machine learning pipeline for predicting diamond prices. It combines randomized hyperparameter search, scikit-learn Pipelines, and Random Forest/XGBoost models with Standard Scaler preprocessing. The dataset contains features such as carat weight, cut, color, and clarity, which are commonly associated with diamond pricing.
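The workflow described above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the data is synthetic, and the column names (`carat`, `depth`, `cut`, `price`), the cut levels, and the hyperparameter ranges are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from scipy.stats import randint
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Synthetic stand-in data (NOT the real dataset): price rises with carat and cut.
rng = np.random.default_rng(0)
n = 300
cut_levels = ["Fair", "Good", "Very Good", "Premium", "Ideal"]  # assumed ordering
df = pd.DataFrame({
    "carat": rng.uniform(0.2, 2.5, n),
    "depth": rng.normal(61.7, 1.4, n),
    "cut": rng.choice(cut_levels, n),
})
cut_rank = df["cut"].map({c: i for i, c in enumerate(cut_levels)})
df["price"] = 4000 * df["carat"] ** 2 + 150 * cut_rank + rng.normal(0, 100, n)

# Scale numeric columns and encode the ordered categorical column.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["carat", "depth"]),
    ("cat", OrdinalEncoder(categories=[cut_levels]), ["cut"]),
])

# Preprocessing and model live in one Pipeline, so each CV fold is
# scaled using statistics from its own training split only.
model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", RandomForestRegressor(random_state=0)),
])

# Illustrative search space; the "regressor__" prefix targets the pipeline step.
param_distributions = {
    "regressor__n_estimators": randint(20, 60),
    "regressor__max_depth": randint(3, 12),
    "regressor__min_samples_leaf": randint(1, 5),
}

X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="price"), df["price"], test_size=0.25, random_state=0
)

search = RandomizedSearchCV(
    model, param_distributions, n_iter=5, cv=3, scoring="r2", random_state=0
)
search.fit(X_train, y_train)

test_r2 = search.score(X_test, y_test)  # R^2 of the best pipeline on held-out data
print(search.best_params_, round(test_r2, 3))
```

To try XGBoost instead, the `regressor` step could be replaced with `xgboost.XGBRegressor`, which follows the scikit-learn estimator interface; the search space would then need XGBoost parameter names.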
- Python 3.12
- Jupyter Notebook
- Libraries:
  - numpy
  - pandas
  - scikit-learn (imported as `sklearn`)
  - xgboost
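As a quick sanity check before opening the notebook, a short script along these lines can report which of the listed libraries are installed. It is only a sketch: the helper name `installed_versions` is hypothetical, and the package names are the PyPI distribution names (scikit-learn is the distribution that provides the `sklearn` module).

```python
from importlib.metadata import PackageNotFoundError, version

# PyPI distribution names for the libraries listed above.
REQUIRED = ["numpy", "pandas", "scikit-learn", "xgboost"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(REQUIRED)
for name, ver in versions.items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```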
The dataset used in this project contains information about various diamonds, including their characteristics and prices. The dataset is not included in this repository due to size limitations, but it can be obtained from source. Make sure to place the dataset file in the same directory as the notebook.
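Loading the file from the notebook's directory might look like the sketch below. The filename `diamonds.csv` and the helper `load_diamonds` are assumptions for illustration; use whatever name the downloaded file actually has.

```python
from pathlib import Path

import pandas as pd


def load_diamonds(filename="diamonds.csv", directory="."):
    """Read the dataset CSV from `directory`.

    `diamonds.csv` is a hypothetical filename, not one given by this project.
    """
    path = Path(directory) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected the dataset at {path.resolve()}; "
            "place the file next to the notebook or pass a different name."
        )
    return pd.read_csv(path)
```

In the notebook this would be called as `df = load_diamonds()`, followed by `df.head()` to confirm that the expected columns are present.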
To run the notebook:
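- Clone the repository.
- Install the required libraries listed above.
- Place the dataset file in the same directory as the notebook.
- Open the notebook in Jupyter and run the cells in order.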
- The dataset used in this project is sourced from source.
- This project draws inspiration from tutorials and documentation available online for scikit-learn, XGBoost, and Kaggle.