Analysis of public reviews of housing providers in Manchester, using Natural Language Processing AI

MSSAIStudyGroup/SocialHousing

Trustpilot Housing Reviews Analysis

Overview

The Trustpilot Housing Reviews Analysis project aims to extract and analyze Trustpilot reviews for various housing providers. The project consists of two main components:

  1. Data Extraction: Automates the retrieval of Trustpilot reviews for specified housing providers.
  2. Classification: Processes and classifies the extracted reviews to identify key issues and sentiment trends.

This analysis helps in understanding tenant satisfaction, common complaints, and areas requiring improvement for housing providers.

  • Funding: This project was funded by a Campion Grant awarded by Manchester Statistical Society. See https://manstatsoc.org/ for more information.
  • Report: The report accompanying this project will be linked here when published.

Features

  • Automated Data Extraction: Scrapes Trustpilot reviews for selected housing providers.
  • Data Cleaning and Preprocessing: Cleans the extracted data for accurate analysis.
  • Text Classification: Categorizes reviews into predefined categories (e.g., Maintenance, Customer Service).
  • Reporting: Generates summary reports and visualizations of findings.
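As an illustration of the cleaning step listed above, here is a minimal sketch using only the standard library. The function names `clean_review_text` and `clean_reviews` are hypothetical and are not taken from the notebooks; the actual preprocessing may differ. The sketch assumes that cleaning means decoding HTML entities, collapsing whitespace, and dropping empty or duplicate reviews.

```python
import html
import re


def clean_review_text(text):
    """Decode HTML entities and collapse runs of whitespace (assumed cleaning rules)."""
    text = html.unescape(text)
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def clean_reviews(texts):
    """Clean each review, then drop empty strings and case-insensitive duplicates."""
    cleaned = []
    seen = set()
    for raw in texts:
        text = clean_review_text(raw)
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned
```

Deduplicating on the lowercased text keeps the first spelling encountered, which is usually enough for reviews that were scraped twice from overlapping pages.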

Project Structure

SocialHousing/
│
├── Housing Association Review Classification and Theme Visualization.ipynb  #  Notebook for classifying reviews and visualizing themes in housing association data.
├── Keyword Analysis 1 Star Reviews.ipynb  #  Notebook for analyzing keywords within 1 star housing association reviews.
├── LICENSE  #  Project license file.
├── README.md  #  Repository overview, setup instructions, and usage guidelines.
├── Themes2D.xlsx  #  Excel file containing theme data, with keywords for HACT UK Data Standards classes, for visualization and further analysis.
├── Trustpilot Review Single Page Extractor.ipynb  #  Notebook for scraping reviews from a single Trustpilot page.
└── Trustpilot Review Extraction Compilation.ipynb  #  Notebook for systematically extracting across multiple Trustpilot pages.

Installation

Prerequisites

  • Python 3.8+: Ensure you have Python installed. You can download it from https://www.python.org/downloads/.

Clone the Repository

git clone https://github.com/MSSAIStudyGroup/SocialHousing.git
cd SocialHousing

Usage

Data Extraction

The data extraction component scrapes Trustpilot for reviews related to specified housing providers.

Configure Housing Providers

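Edit the housing_providers list in the extraction code to specify the housing providers you want to analyze. The project structure above lists no extraction.py, so the list is presumably defined in Trustpilot Review Extraction Compilation.ipynb.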

# Example
housing_providers = [
    "msvhousing",
    "clarionhousing",
    "onehousing",
    "onward",
    "yourhousinggroup",
    "jigsawhomes",
    "placesforpeople",
    "guinnesspartnership"
]
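To show how identifiers like these might be turned into page requests, here is a minimal sketch. It assumes that each entry is the identifier that follows `/review/` in a Trustpilot URL and that result pages are selected with a `page` query parameter; both are assumptions rather than details confirmed by the notebooks. `BASE_URL` and `build_page_urls` are hypothetical names.

```python
from urllib.parse import quote, urlencode

# Assumed base URL for a provider's public review page.
BASE_URL = "https://www.trustpilot.com/review/"


def build_page_urls(provider, num_pages):
    """Return the URLs for pages 1..num_pages of one provider's reviews."""
    base = BASE_URL + quote(provider, safe="")
    return [
        f"{base}?{urlencode({'page': page})}"
        for page in range(1, num_pages + 1)
    ]
```

For example, `build_page_urls("onward", 2)` yields one URL for page 1 and one for page 2 of that provider.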

Run the Data Extraction Tool

Run the data extraction tool from the Jupyter notebook; the project structure lists no standalone script.

Using Jupyter Notebook:

Open Trustpilot Review Extraction Compilation.ipynb and run the cells sequentially.
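The multi-page compilation can be pictured as a loop that requests pages until one comes back empty. The sketch below is not the notebook's code: `collect_reviews` is a hypothetical helper, `fetch_page` stands in for whatever function downloads and parses a single page, and the review fields (`date`, `text`, `rating`) are assumed.

```python
def collect_reviews(fetch_page, max_pages=50):
    """Gather reviews page by page until a page is empty or max_pages is reached.

    fetch_page is any callable taking a 1-based page number and returning
    a list of review dicts (hypothetical interface).
    """
    all_reviews = []
    seen = set()
    for page in range(1, max_pages + 1):
        page_reviews = fetch_page(page)
        if not page_reviews:
            break  # an empty page is treated as the end of the listing
        for review in page_reviews:
            # Skip reviews already seen on an earlier page.
            key = (review.get("date"), review.get("text"))
            if key not in seen:
                seen.add(key)
                all_reviews.append(review)
    return all_reviews


# Made-up pages standing in for real network responses.
fake_pages = {
    1: [{"date": "2024-01-01", "text": "Slow repairs", "rating": 1}],
    2: [
        {"date": "2024-01-01", "text": "Slow repairs", "rating": 1},
        {"date": "2024-01-05", "text": "Helpful staff", "rating": 5},
    ],
}
reviews = collect_reviews(lambda page: fake_pages.get(page, []))
```

Passing the fetcher in as an argument keeps the loop testable without network access, and the `max_pages` cap guards against a site that never returns an empty page.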

Classification

The classification component processes the extracted reviews and categorizes them based on predefined criteria.

Using Jupyter Notebook:

Open Housing Association Review Classification and Theme Visualization.ipynb and run the cells sequentially.
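As a rough picture of what categorizing reviews against predefined criteria can look like, here is a keyword-matching sketch in plain Python. It is not the notebook's method, which may rely on scikit-learn. The theme names come from the Features section, but the keywords are invented for illustration; the project's own keywords are stored in Themes2D.xlsx. `THEME_KEYWORDS`, `tokenize`, and `classify_review` are hypothetical names.

```python
import re
from collections import Counter

# Hypothetical theme -> keyword map, for illustration only.
THEME_KEYWORDS = {
    "Maintenance": {"repair", "repairs", "leak", "boiler", "damp", "mould"},
    "Customer Service": {"rude", "call", "phone", "email", "response", "staff"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def classify_review(text, theme_keywords=THEME_KEYWORDS):
    """Return the theme whose keywords occur most often, or 'Unclassified'."""
    tokens = Counter(tokenize(text))
    scores = {
        theme: sum(tokens[word] for word in keywords)
        for theme, keywords in theme_keywords.items()
    }
    best_theme, best_score = max(scores.items(), key=lambda item: item[1])
    return best_theme if best_score > 0 else "Unclassified"
```

Ties go to whichever theme appears first in the map, so a production version would likely need an explicit tie-breaking rule or support for multiple labels per review.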

Dependencies

Required Python packages are:

  • Requests: HTTP library for fetching review pages.
  • BeautifulSoup4: HTML parsing of the fetched pages.
  • pandas: Data manipulation and analysis.
  • scikit-learn: Machine learning for classification.
  • Matplotlib: Data visualization.
  • Seaborn: Data visualization.
  • Jupyter Notebook: Interactive development.

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the Repository

  2. Create a Feature Branch

    git checkout -b feature/YourFeature

  3. Commit Your Changes

    git commit -m "Add some feature"

  4. Push to the Branch

    git push origin feature/YourFeature

  5. Open a Pull Request

License

This project is licensed under CC0-1.0. See the LICENSE file for details.

Contact

For any questions or suggestions, please open an issue or contact guy@fuza.co.uk.


Disclaimer: This project is not affiliated with Trustpilot or any of the housing providers mentioned. It is intended for educational and analytical purposes only.
