Urban Heat Island Index (UHI) prediction
The objective of this project is to develop a machine learning model that estimates the Urban Heat Island (UHI) Index in urban areas from meteorological and environmental data, and so predicts where heat island hotspots occur. The model is also designed to identify and highlight the key factors that contribute most to the development of these hotspots.
Participants will be provided with near-surface air temperature data in an index format, collected on July 24, 2021 by ground traverse in the Bronx and Manhattan regions of New York City. The dataset consists of the traverse points (latitude and longitude) and their corresponding UHI index values. Participants will use it to build a regression model that predicts UHI index values for a given set of locations.
It is important to understand that the UHI Index at any given location is indicative of the relative temperature difference at that specific point compared to the average temperature of the city. This index serves as a crucial metric for assessing heat intensity within different urban areas.
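The text does not give the exact formula for the index. As a minimal sketch, assuming the index is the ratio of each local temperature to the mean over all traverse points (an assumption, not something stated above), it could be computed like this. The function name `uhi_index` and the sample readings are made up for illustration.

```python
from statistics import mean


def uhi_index(temps_c):
    """Return each temperature divided by the mean of all temperatures.

    Assumed definition: index = local temperature / city-wide mean.
    Values above 1 mark points warmer than the average, below 1 cooler.
    """
    city_mean = mean(temps_c)
    return [t / city_mean for t in temps_c]


# Made-up readings in degrees C, only to show the shape of the result.
readings = [29.0, 30.0, 31.0]
for temp, index in zip(readings, uhi_index(readings)):
    print(f"{temp:.1f} C -> index {index:.4f}")
```

Under this reading, a point exactly at the city-wide mean has an index of 1.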
Weather data: Provided by the New York State Mesonet.
Includes measurements of temperature, relative humidity, wind speed, wind direction, and solar flux.
Two locations: Bronx and Manhattan.
Ground Track Data: Temperature measurements taken at different points in the city.
Date/Time: Date and time of measurement.
Surface air temperature [degrees C]: Air temperature near the surface.
Relative humidity [percent]: Relative humidity.
Average wind speed [m/s]: Average wind speed.
Wind direction [degrees]: Wind direction.
Solar flux [W/m^2]: Solar flux at the surface.
Loading the data for both locations: Bronx and Manhattan.
Converting the Date/Time column to datetime type.
Combining the data from both locations.
Calculating the UHI as the temperature difference between the Bronx and Manhattan.
Calculating the Heat Index for both locations.
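The preprocessing steps above can be sketched with pandas as follows. This is an illustration under stated assumptions, not the project's actual `data_processing.py`: the column names (`datetime`, `temp_c`, `rh`), the sample values, and the sign convention (Bronx minus Manhattan) are hypothetical, and the source does not say which heat index formula was used, so the sketch assumes the National Weather Service Rothfusz regression.

```python
import pandas as pd


def heat_index_c(temp_c, rh):
    """Heat index in degrees C from air temperature (C) and humidity (%).

    Assumption: NWS Rothfusz regression, which works in Fahrenheit and is
    meant for hot, humid conditions (roughly 80 F and above). The NWS
    adjustments for other ranges are omitted here.
    """
    t = temp_c * 9.0 / 5.0 + 32.0
    hi_f = (
        -42.379
        + 2.04901523 * t
        + 10.14333127 * rh
        - 0.22475541 * t * rh
        - 0.00683783 * t * t
        - 0.05481717 * rh * rh
        + 0.00122874 * t * t * rh
        + 0.00085282 * t * rh * rh
        - 0.00000199 * t * t * rh * rh
    )
    return (hi_f - 32.0) * 5.0 / 9.0


def preprocess(bronx, manhattan):
    """Convert timestamps, add a heat index, and combine both locations."""
    prepared = []
    for df in (bronx, manhattan):
        df = df.copy()
        df["datetime"] = pd.to_datetime(df["datetime"])
        df["heat_index_c"] = heat_index_c(df["temp_c"], df["rh"])
        prepared.append(df)
    # Align the two stations on their shared timestamps.
    combined = prepared[0].merge(
        prepared[1], on="datetime", suffixes=("_bronx", "_manhattan")
    )
    # Assumed sign convention: Bronx temperature minus Manhattan temperature.
    combined["uhi"] = combined["temp_c_bronx"] - combined["temp_c_manhattan"]
    return combined


# Made-up sample rows standing in for the two Mesonet sheets.
bronx = pd.DataFrame(
    {"datetime": ["2021-07-24 15:00", "2021-07-24 15:05"],
     "temp_c": [30.5, 31.0], "rh": [50.0, 48.0]}
)
manhattan = pd.DataFrame(
    {"datetime": ["2021-07-24 15:00", "2021-07-24 15:05"],
     "temp_c": [31.5, 31.8], "rh": [47.0, 46.0]}
)
combined = preprocess(bronx, manhattan)
print(combined[["datetime", "uhi", "heat_index_c_bronx"]])
```

Merging on the timestamp keeps only rows where both stations have a reading, which is what a per-timestamp difference requires.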
Feature Selection:
Relevant features for the model were selected, including temperature, humidity, wind speed, and solar flux.
The model is a decision-tree-based machine learning model, suitable for regression problems.
The data was split into training (80%) and test (20%) sets.
The model was trained using the training set.
The performance of the model was evaluated using metrics such as MSE (Mean Squared Error) and R² (Coefficient of Determination).
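The split, training, and evaluation steps can be sketched with scikit-learn as below. The text only says the model is decision-tree-based, so `RandomForestRegressor` is a stand-in chosen for illustration. The data is synthetic and the feature names are hypothetical, so the printed metrics will not match the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical feature names; the real columns come from the Mesonet data.
feature_names = ["temperature", "humidity", "wind_speed", "solar_flux"]

# Synthetic stand-in data so the sketch runs on its own.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, len(feature_names)))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=200)

# 80% training, 20% test, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Stand-in for the project's decision-tree-based model.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out test set.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MSE: {mse:.4f}  R2: {r2:.4f}")

# Tree-based models expose per-feature importances, one way to see
# which inputs contribute most to the predictions.
for name, importance in zip(feature_names, model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```

Fixing `random_state` in both the split and the model makes the run repeatable, which helps when comparing metrics across changes.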
Model Results
MSE: 0.0528
R²: 0.9675
These results indicate that the model is highly predictive and explains approximately 96.75% of the variability in the data.
A simple interface was developed for users to enter data and obtain UHI predictions.
Input fields for meteorological characteristics.
Button to make the prediction.
Display of the prediction and the entered data.
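The logic behind the prediction button can be sketched independently of any UI framework. This is an assumption about how the interface might be wired, not the project's actual `app.py`: the field names in `FEATURES`, the function `handle_prediction`, and the `ConstantModel` stub are all hypothetical.

```python
# Hypothetical input fields, in the order the model is assumed to expect.
FEATURES = [
    "temperature_c",
    "relative_humidity",
    "wind_speed",
    "wind_direction",
    "solar_flux",
]


def handle_prediction(form, model):
    """Validate raw form input, run the model, and echo the inputs back."""
    values = {}
    for name in FEATURES:
        raw = form.get(name, "")
        try:
            values[name] = float(raw)
        except (TypeError, ValueError):
            raise ValueError(f"Field '{name}' must be a number, got {raw!r}")
    row = [[values[name] for name in FEATURES]]
    prediction = float(model.predict(row)[0])
    return {"inputs": values, "uhi_prediction": prediction}


class ConstantModel:
    """Stub standing in for the trained model; any object with predict() works."""

    def predict(self, rows):
        return [0.0 for _ in rows]


form = {
    "temperature_c": "30.5",
    "relative_humidity": "50",
    "wind_speed": "2.4",
    "wind_direction": "180",
    "solar_flux": "650",
}
result = handle_prediction(form, ConstantModel())
print(result)
```

Keeping validation and prediction in one plain function means the same code can sit behind a Streamlit button or a Flask route.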
CemillaX
This project aims to predict the Urban Heat Island Index (UHI) using meteorological and environmental data.
CemillaX/
│
├── data/                          # Folder to store data
│   ├── raw/                       # Raw (unprocessed) data
│   ├── processed/                 # Processed data
│   └── external/                  # External data (if any)
│
├── models/                        # Trained models
│   └── uhi_prediction_model.pkl   # Saved model
│
├── notebooks/                     # Jupyter/Colab notebooks
│   ├── 01_data_exploration.ipynb
│   ├── 02_model_training.ipynb
│   └── 03_model_evaluation.ipynb
│
├── src/                           # Source code (scripts)
│   ├── data_processing.py         # Script to process data
│   ├── train_model.py             # Script to train the model
│   ├── evaluate_model.py          # Script to evaluate the model
│   └── predict.py                 # Script to make predictions
│
├── tests/                         # Unit tests
│   ├── test_data_processing.py
│   ├── test_model.py
│   └── __init__.py
│
├── app/                           # Application (Streamlit, Flask, etc.)
│   ├── app.py                     # Main application script
│   ├── templates/                 # HTML templates (if it's a web API)
│   └── static/                    # Static files (CSS, JS, images)
│
├── docs/                          # Project documentation
│   ├── README.md                  # Project overview
│   ├── requirements.txt           # Project dependencies
│   └── images/                    # Images for documentation
│
├── .gitignore                     # Files and folders ignored by Git
├── LICENSE                        # Project license
└── README.md                      # Main README file
data/raw/: Stores the raw data (e.g. the original Excel file NY_Mesonet_Weather_New_Data.xlsx).
data/processed/: Stores the processed data (e.g. the NY_Mesonet_Weather_Processed.xlsx file).
data/external/: External data that is not part of the main dataset (optional).
models/: Stores the trained models (e.g. cemillax_uhi_prediction_model.pkl).
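notebooks/: Contains the Jupyter or Google Colab notebooks used to explore data, train models, and evaluate results.
notebooks/01_data_exploration.ipynb: Exploratory data analysis (EDA).
notebooks/02_model_training.ipynb: Model training.
notebooks/03_model_evaluation.ipynb: Model evaluation and results visualization.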
src/: Contains Python scripts to automate tasks:
src/data_processing.py: Processes raw data and prepares it for the model.
src/train_model.py: Trains the model and saves it to the models/ folder.
src/evaluate_model.py: Evaluates the model and generates metrics.
src/predict.py: Makes predictions using the trained model.
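The hand-off between the training and prediction scripts (save a model to the models/ folder, then load it to predict) can be sketched as follows. The helper names `save_model` and `load_model`, the toy data, and the use of `DecisionTreeRegressor` as a tiny stand-in model are assumptions for illustration. The sketch writes to a temporary directory instead of the real models/ folder.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.tree import DecisionTreeRegressor


def save_model(model, path):
    """Pickle a trained model, creating the parent folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled model. Only unpickle files from a trusted source."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


# Toy training data: [temperature, humidity] -> target (made-up values).
X = [[25.0, 60.0], [30.0, 50.0], [35.0, 40.0]]
y = [0.0, 0.5, 1.0]
model = DecisionTreeRegressor(random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "models" / "uhi_prediction_model.pkl"
    save_model(model, model_path)      # what a training script would do
    restored = load_model(model_path)  # what a prediction script would do
    prediction = restored.predict([[30.0, 50.0]])[0]

print(prediction)
```

Pickle files can execute arbitrary code when loaded, so a model file should be treated like code and loaded only from a trusted location.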
tests/: Contains unit tests to validate the code:
tests/test_data_processing.py: Tests for the data processing script.
tests/test_model.py: Tests for the trained model.
app/: Contains the application to interact with the model:
app/app.py: Main script of the application (for example, an API with Flask or an interface with Streamlit).
app/templates/: HTML templates if the application is a web API.
app/static/: Static files (CSS, JS, images) if the application is a web API.
docs/: Contains the project documentation:
docs/README.md: Project overview.
docs/requirements.txt: List of project dependencies.
docs/images/: Images used in the documentation.
.gitignore: Specifies files and folders that Git should ignore (for example, virtual environment files or sensitive data).
LICENSE: Project license (for example, MIT, Apache, etc.).
README.md: Main README file describing the project.