This project is part of the WiDS Datathon 2023 - Sub-seasonal Temperature Forecasting, sponsored by Omdena Silicon Valley Chapter.
This GitHub repository houses the Omdena Silicon Valley Chapter project for the WiDS Datathon 2023. The aim is to predict the 14-day average of maximum and minimum temperatures for various US locations using historical weather data. Better sub-seasonal forecasts help communities prepare for extreme weather driven by climate change.
The dataset includes temperature, wind speed, and vapor pressure across multiple US locations. Each record represents a unique location and a two-week period, and the dataset is designed to challenge our predictive modeling techniques.
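As a rough illustration of the prediction target described above, the sketch below computes a 14-day average from daily maximum and minimum temperatures. It assumes one possible reading of the target (the mean over a 14-day window of the daily midpoint between maximum and minimum temperature). The function name `forward_14day_mean` and the input lists are hypothetical and do not come from the datathon files.

```python
from statistics import mean


def forward_14day_mean(tmax, tmin, window=14):
    """Average the daily midpoint (tmax + tmin) / 2 over each `window`-day span.

    Assumption: the target is the mean of daily midpoints; the actual
    datathon target may be defined differently.
    """
    if len(tmax) != len(tmin):
        raise ValueError("tmax and tmin must have the same length")
    midpoints = [(hi + lo) / 2 for hi, lo in zip(tmax, tmin)]
    # One value per start day that has a full window of data after it.
    return [
        mean(midpoints[i:i + window])
        for i in range(len(midpoints) - window + 1)
    ]


# Hypothetical daily readings for one location (15 days).
daily_max = [20.0] * 14 + [34.0]
daily_min = [10.0] * 14 + [20.0]
targets = forward_14day_mean(daily_max, daily_min)
```

A series shorter than the window yields an empty list, since no start day has a full 14 days of data.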
- Exploratory Data Analysis: Understand data distributions and relationships.
- Data Preprocessing: Clean, handle missing values, and detect outliers.
- Feature Engineering: Identify and select impactful features for the models.
- Model Development: Use machine learning methods like Random Forest, XGBoost, and CNNs.
- Model Evaluation: Compare model performances and select the best.
- Deployment Preparation: Ready the final model for real-time forecasting applications.
- Feature Selection: Identifying the most relevant features for accurate predictions.
- Data Merging and Preprocessing: Effectively combining and preparing datasets for modeling.
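The model evaluation step listed above (compare model performances and select the best) could be sketched as follows. This is a minimal example that assumes root mean squared error (RMSE) as the comparison metric; the project may use a different one. The model names and prediction values are made up for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def select_best(y_true, predictions):
    """Score every candidate and return (best_name, all_scores).

    `predictions` maps a model name to its predictions on the same
    held-out records as `y_true`. Lower RMSE is better.
    """
    scores = {name: rmse(y_true, preds) for name, preds in predictions.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical held-out targets and predictions from two candidates.
y_holdout = [10.0, 12.0, 14.0]
candidates = {
    "model_a": [10.0, 12.0, 14.0],
    "model_b": [11.0, 13.0, 15.0],
}
best_name, all_scores = select_best(y_holdout, candidates)
```

Keeping the scoring function separate from the selection step makes it easy to swap in another metric later.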
Participants will gain expertise in:
- Data Pre-processing: Techniques for cleaning and transforming data.
- Feature Engineering: Analyzing and selecting significant data features.
- Model Development: Building and tuning advanced predictive models.
- Model Evaluation: Applying metrics to evaluate model performance.
- Deployment: Deploying models in production environments.
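To make the data pre-processing outcome above more concrete, here is a small sketch of two common cleaning steps: filling missing values with the median and flagging outliers with the interquartile range (IQR) rule. These techniques are chosen for illustration only; the project does not state which methods it uses. The function names and sample values are hypothetical.

```python
from statistics import median, quantiles


def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a median from")
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Hypothetical wind-speed readings with one gap and one suspicious spike.
raw = [10.0, None, 11.0, 12.0, 13.0, 50.0]
filled = fill_missing_with_median(raw)
flags = iqr_outlier_flags(filled)
```

Flagging rather than dropping outliers keeps the decision about how to treat them (remove, cap, or keep) separate from detection.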
This project is led by the Omdena Silicon Valley Chapter Lead Nishrin Kachwala, and Omdena Collaborators, comprising data scientists and AI practitioners focused on using machine learning for social good.
Our project contributes to and enhances community resilience against climate change.
- Have a look at the project structure and folder overview below to understand where to store or upload your contribution.
- If you're creating a task, go to the tasks folder, create a new folder following the naming convention below, and add a README.md with task details and goals to help other contributors understand it.
- Task folder naming convention: task-n-taskname (n is the task number), e.g. task-1-data-analysis, task-2-model-deployment.
- Create a README.md with an information table listing all contributions for the task.
- If you're contributing to a task, please store your work in the relevant location and update the README.md information table with your contribution details.
- Make sure your file names (Jupyter notebooks, Python files, data sheets, etc.) are descriptive so others can easily identify them.
- Please refrain from creating unnecessary folders anywhere other than the 'tasks' folder (following the naming convention above) to avoid confusion.
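The task-n-taskname convention above can be checked with a small script. The sketch below is one possible validator. It assumes that task names are lowercase words separated by hyphens, which is inferred from the two examples rather than stated as a rule. `parse_task_folder` is a hypothetical helper, not part of this repository.

```python
import re

# Assumption: n is a positive integer without leading zeros, and the
# task name uses lowercase letters, digits, and single hyphens.
TASK_FOLDER_PATTERN = re.compile(r"task-([1-9]\d*)-([a-z0-9]+(?:-[a-z0-9]+)*)")


def parse_task_folder(name):
    """Return (task_number, task_name) or None if the name does not match."""
    match = TASK_FOLDER_PATTERN.fullmatch(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


valid = parse_task_folder("task-1-data-analysis")
invalid = parse_task_folder("Task_3_modeling")
```

Returning the parsed number and name, instead of only True or False, lets the same helper be reused for sorting task folders.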
├── LICENSE
├── README.md <- The top-level README for developers/collaborators using this project.
├── original <- Original source code of the challenge hosted by Omdena. Can be used as reference code for the current project goal.
│
│
├── reports <- Folder containing the final reports/results of this project
│ └── README.md <- Details about final reports and analysis
│
│
├── src <- Source code folder for this project
│
├── data <- Datasets used and collected for this project
│
├── docs <- Folder for Task documentations, Meeting Presentations and task Workflow Documents and Diagrams.
│
├── references <- Data dictionaries, manuals, and all other explanatory references used
│
├── tasks <- Master folder for all individual task folders
│
├── visualizations <- Code and Visualization dashboards generated for the project
│
└── results <- Folder to store Final analysis and modelling results and code.
- Original - Folder Containing old/completed Omdena challenge code.
- Reports - Folder to store all Final Reports of this project
- Data - Folder to Store all the data collected and used for this project
- Docs - Folder for Task documentations, Meeting Presentations and task Workflow Documents and Diagrams.
- References - Folder to store any referenced code, research papers, and other useful documents used for this project
- Tasks - Master folder for all tasks
  - All task folder names should follow the task-n-taskname naming convention
  - All task folder names should be in chronological order (from 1 to n)
  - All task folders should have a README.md file with task details and task goals, along with an info table listing all code/notebook files with their links and descriptions
  - Update the task table whenever a task is created and explain the purpose and goals of the task to others.
- Visualizations - Folder to store dashboards, analysis, and visualization reports
- Results - Folder to store final analysis and modelling results for the project.
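The rules for the tasks folder described above (naming convention, numbering from 1 to n, and a README.md in every task folder) could be checked automatically. The sketch below is one way to do that, assuming "from 1 to n" means consecutive numbering with no gaps. `check_tasks_folder` is a hypothetical helper and is not part of this repository.

```python
import re
from pathlib import Path


def check_tasks_folder(tasks_dir):
    """Return a list of human-readable problems found under `tasks_dir`."""
    problems = []
    numbers = []
    folders = sorted(p for p in Path(tasks_dir).iterdir() if p.is_dir())
    for folder in folders:
        match = re.fullmatch(r"task-(\d+)-.+", folder.name)
        if match is None:
            problems.append(f"{folder.name}: does not follow task-n-taskname")
            continue
        numbers.append(int(match.group(1)))
        if not (folder / "README.md").is_file():
            problems.append(f"{folder.name}: missing README.md")
    # Assumption: task numbers should run 1, 2, ..., n without gaps.
    if sorted(numbers) != list(range(1, len(numbers) + 1)):
        problems.append("task numbers are not consecutive from 1 to n")
    return problems
```

Calling `check_tasks_folder("tasks")` from the repository root returns an empty list when every task folder follows the rules.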