Housing Prices Prediction Project

This project predicts housing prices based on various features such as median income, house age, and average number of rooms. It leverages machine learning techniques with a Linear Regression model and provides insights through evaluation metrics.

Overview

The Housing Prices Prediction Project applies machine learning to analyze the California Housing Dataset and predict housing prices. It preprocesses data, trains a model, evaluates performance, and saves the model for future predictions.

Features

Data Preprocessing:
- Standardizes numerical features using StandardScaler for consistent model training.
Model Training:
- A pipeline integrates preprocessing with a Linear Regression model.
Model Evaluation:
- Calculates the Mean Squared Error (MSE) on held-out test data to measure prediction error.
Model Persistence:
- Saves the trained model with joblib for reuse.
Prediction:
- Accepts new input data and predicts housing prices.
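
As a rough illustration of the preprocessing step, the sketch below reproduces in plain Python what StandardScaler computes for a single column: subtract the column mean, then divide by the population standard deviation. The function name `standardize` and the sample values are made up for illustration and are not taken from the project script.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    center = mean(values)
    spread = pstdev(values)  # StandardScaler also uses the population std (ddof=0)
    if spread == 0:
        # A constant column has no spread; every centered value is zero.
        return [0.0 for _ in values]
    return [(x - center) / spread for x in values]


# Hypothetical house ages in years, for illustration only.
house_ages = [10.0, 20.0, 30.0, 40.0]
scaled = standardize(house_ages)
print(scaled)
```

After scaling, the column has mean 0 and standard deviation 1, so features measured on very different scales (income, age, population) become comparable.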

Technologies Used

Python (Core Language)
NumPy and pandas (Data Handling)
scikit-learn (Modeling, Preprocessing, Evaluation)
joblib (Model Serialization)

How It Works

Dataset:
- The California Housing Dataset is loaded using fetch_california_housing.
- Features and target values are extracted. The target is the median house value per district, expressed in units of $100,000.
Data Splitting:
- The dataset is split into training (80%) and testing (20%) subsets.
Preprocessing:
- Numerical features are standardized using StandardScaler within a ColumnTransformer.
Model Training:
- A Linear Regression model is trained using the preprocessed training data.
Evaluation:
- The model predicts prices for the test set, and the Mean Squared Error (MSE) is computed.
Saving and Loading the Model:
- The trained model is saved as housing_prices_model.joblib for future predictions.
- The saved model is reloaded for predicting prices for new data.
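
The steps above can be sketched roughly as follows. This is a minimal sketch, not the project's actual script: the function names (`load_california`, `build_pipeline`, `train_and_evaluate`), the step names inside the pipeline, and the `seed=42` default are assumptions chosen for illustration. Only the 80/20 split, the StandardScaler inside a ColumnTransformer, the Linear Regression model, the MSE metric, and the output file name come from the description above.

```python
import joblib
from sklearn.compose import ColumnTransformer
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def load_california():
    """Return (features DataFrame, target Series); downloads on first use."""
    data = fetch_california_housing(as_frame=True)
    return data.data, data.target


def build_pipeline(numeric_columns):
    """Standardize the numeric columns, then fit a linear regression."""
    preprocessor = ColumnTransformer(
        transformers=[("num", StandardScaler(), list(numeric_columns))]
    )
    return Pipeline(
        [("preprocess", preprocessor), ("regressor", LinearRegression())]
    )


def train_and_evaluate(X, y, model_path="housing_prices_model.joblib", seed=42):
    """Split 80/20, fit, compute the test MSE, and save the fitted pipeline."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = build_pipeline(X.columns)
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    joblib.dump(model, model_path)  # reload later with joblib.load(model_path)
    return model, mse
```

Calling `model, mse = train_and_evaluate(*load_california())` runs the whole flow on the real dataset. Because the scaler lives inside the pipeline, it is fitted on the training split only, and the saved file carries both the scaler and the regressor. The split seed is arbitrary, so the resulting MSE need not match the figure reported in the Output section.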

Setup and Installation

Clone the Repository:

git clone https://github.com/siddhinarayan09/house-prices-prediction.git
cd house-prices-prediction

Install Dependencies: Ensure Python 3.6+ is installed, then install the required libraries:
```
pip install numpy pandas scikit-learn joblib
```
Run the Script: Execute the Python script:
```
python house-prices-prediction.py
```

Usage

Run the Script:

Running the script trains the model, evaluates its performance on the test set, and saves it for reuse.

Predict New Prices:

Edit the new_house DataFrame in the script with the desired input features. The script then loads the saved model and predicts the price for that house.
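
A prediction for a single house might look like the sketch below. It assumes the saved pipeline was fitted on a DataFrame carrying the eight column names that scikit-learn exposes for the California Housing dataset. The helper names (`predict_price_usd`, `load_and_predict`) and the input values are hypothetical, not taken from the project script.

```python
import joblib
import pandas as pd

# Column names of the California Housing dataset as exposed by scikit-learn.
FEATURE_COLUMNS = [
    "MedInc", "HouseAge", "AveRooms", "AveBedrms",
    "Population", "AveOccup", "Latitude", "Longitude",
]


def predict_price_usd(model, house):
    """Predict a price in dollars for one house given as a feature dict."""
    new_house = pd.DataFrame([house], columns=FEATURE_COLUMNS)
    # The dataset's target is in units of $100,000, so scale to dollars.
    return float(model.predict(new_house)[0]) * 100_000


def load_and_predict(house, model_path="housing_prices_model.joblib"):
    """Load the saved pipeline and predict for a single house."""
    model = joblib.load(model_path)
    return predict_price_usd(model, house)


# Hypothetical input values, for illustration only.
example_house = {
    "MedInc": 5.0, "HouseAge": 20.0, "AveRooms": 6.0, "AveBedrms": 1.0,
    "Population": 1000.0, "AveOccup": 3.0, "Latitude": 34.0, "Longitude": -118.0,
}
```

Once the model file exists, `load_and_predict(example_house)` returns the predicted price as a float in dollars.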

Output

The script prints the Mean Squared Error (MSE) on the test data and the predicted price for the new house, for example:

Mean Squared Error on Test Data: 0.47
Predicted Price for the New House: $237500.00
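
To read these numbers, it helps to convert them to dollars. The sketch below assumes the reported MSE is computed on the raw dataset target, which scikit-learn provides in units of $100,000. Under that assumption the square root of the MSE (the RMSE) is a typical prediction error in the same units. The helper names are hypothetical.

```python
import math

TARGET_UNIT_USD = 100_000  # California Housing target is in units of $100,000


def rmse_in_dollars(mse):
    """Convert an MSE in squared target units to an RMSE in dollars."""
    return math.sqrt(mse) * TARGET_UNIT_USD


def to_dollars(prediction):
    """Convert a raw model prediction (units of $100,000) to dollars."""
    return prediction * TARGET_UNIT_USD


print(f"Typical error: ${rmse_in_dollars(0.47):,.0f}")
print(f"Predicted price: ${to_dollars(2.375):.2f}")
```

Under this assumption, an MSE of 0.47 corresponds to a typical error of roughly $68,600 per prediction, and a raw prediction of 2.375 corresponds to the $237500.00 shown above.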

Future Improvements

Experiment with advanced models such as Random Forest or Gradient Boosting.

Conduct hyperparameter tuning to optimize the model.

Implement feature engineering to improve accuracy.

Add support for categorical and text features using encoders such as OneHotEncoder (categorical) and CountVectorizer (text).
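
As one possible starting point for the first two items, the sketch below swaps the linear model for a Random Forest and tunes it with a small grid search. The function name `tune_random_forest`, the grid values, and the fold count are illustrative assumptions. Nothing here is part of the current script.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def tune_random_forest(X, y, param_grid=None, cv=3, seed=42):
    """Grid-search a scaled Random Forest pipeline; return the fitted search."""
    pipeline = Pipeline([
        ("scale", StandardScaler()),  # trees do not need scaling; kept for parity
        ("regressor", RandomForestRegressor(random_state=seed)),
    ])
    if param_grid is None:
        # Small illustrative grid; a real search would cover more values.
        param_grid = {
            "regressor__n_estimators": [50, 100],
            "regressor__max_depth": [None, 10],
        }
    search = GridSearchCV(
        pipeline, param_grid, cv=cv, scoring="neg_mean_squared_error"
    )
    search.fit(X, y)
    return search
```

After fitting, `search.best_params_` holds the selected settings and `-search.best_score_` is the cross-validated MSE, which can be compared with the linear baseline.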

Acknowledgments

scikit-learn: For providing the dataset and ML tools.

joblib: For efficient model persistence.

NumPy and pandas: For data manipulation.
