This project focuses on analyzing COVID-19 data to uncover insights related to cases, deaths, vaccinations, and global trends.

anshulkansal121/COVID_Data_Analysis_SQL

COVID-19 Data Analysis Project

This project focuses on analyzing COVID-19 data to uncover insights related to cases, deaths, vaccinations, and global trends. The dataset is sourced from Our World in Data and has been structured into two main tables within an MS SQL Server database: CovidDeaths and CovidVaccinations.

Objectives

The project aims to:

  1. Analyze the spread and impact of COVID-19 globally and regionally.
  2. Explore the relationship between vaccination rates, testing rates, and case/death trends.
  3. Provide actionable insights into public health strategies and vaccination rollouts.

Dataset Structure

The dataset is divided into two tables:

covid_deaths

Attributes include:

  • location: Country or region name.
  • continent: Continent name.
  • date: Date of the record.
  • total_cases: Cumulative number of confirmed cases.
  • new_cases: Daily new confirmed cases.
  • total_deaths: Cumulative number of deaths.
  • new_deaths: Daily new deaths.
  • population: Population of the country/region.
  • gdp_per_capita: GDP per capita for the country/region.
  • Other metrics such as new_cases_smoothed, new_deaths_smoothed, and per-million values for cases and deaths.

covid_vaccinations

Attributes include:

  • location: Country or region name.
  • date: Date of the record.
  • people_vaccinated: Total number of people who have received at least one vaccine dose.
  • people_fully_vaccinated: Total number of people fully vaccinated.
  • total_boosters: Total number of booster doses administered.
  • new_vaccinations: Daily new vaccinations.
  • gdp_per_capita: GDP per capita for the country/region.
  • Other metrics such as vaccination rates and per-million values.
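
Because both tables carry location and date, they can be joined on that pair. The query below is a minimal sketch rather than one of the project's own scripts. It assumes the tables are named covid_deaths and covid_vaccinations as in this section (the introduction calls them CovidDeaths and CovidVaccinations, so adjust to match your import), that date was imported as a date or datetime type, and that numeric columns may have been imported as text, hence the TRY_CONVERT calls. The column aliases doses_to_date and doses_per_100_people are illustrative names.

```sql
-- Sketch: running total of vaccine doses per location, relative to population.
-- Assumes table names covid_deaths / covid_vaccinations; adjust to your import.
WITH vaccinations_running AS (
    SELECT
        d.continent,
        d.location,
        d.date,
        TRY_CONVERT(float, d.population) AS population,
        -- Cumulative sum of daily doses within each location, ordered by date.
        SUM(TRY_CONVERT(float, v.new_vaccinations)) OVER (
            PARTITION BY d.location
            ORDER BY d.date
            ROWS UNBOUNDED PRECEDING
        ) AS doses_to_date
    FROM covid_deaths AS d
    JOIN covid_vaccinations AS v
        ON v.location = d.location
       AND v.date = d.date
    -- Assumption: aggregate rows such as "World" have a NULL continent.
    WHERE d.continent IS NOT NULL
)
SELECT
    continent,
    location,
    date,
    population,
    doses_to_date,
    100.0 * doses_to_date / NULLIF(population, 0) AS doses_per_100_people
FROM vaccinations_running
ORDER BY location, date;
```

The result counts doses, not people, so it can exceed 100 per 100 people once second doses and boosters are included. For coverage in terms of people, people_vaccinated or people_fully_vaccinated is the more direct column.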

Key Questions Explored

  1. What percentage of the population was infected in each country?
  2. Which country recorded the highest daily new cases and deaths, and on which dates?
  3. What are the average cases and deaths per million population across continents?
  4. Which countries had the highest and lowest case fatality rates (CFR)?
  5. Which countries experienced sustained high deaths per million over time?
  6. Which countries have the highest percentage of fully vaccinated population?
  7. Which countries achieved 10% to 50% vaccination coverage in the shortest time?
  8. Is there a correlation between vaccination rates and testing rates?
  9. How do vaccination rates compare across countries with different GDP per capita?
  10. Did countries with high vaccination rates experience a reduction in new cases per million?
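
As an illustration of how questions 1 and 4 might be approached, the following sketch computes the share of the population infected and the case fatality rate per country. It is not taken from the repository's scripts. It assumes the covid_deaths table name used above, that total_cases and total_deaths are cumulative (so MAX approximates the latest value, ignoring any downward revisions), and that rows with a NULL continent are aggregates to exclude.

```sql
-- Sketch for questions 1 and 4: percent of population infected and CFR per country.
-- CFR here is total confirmed deaths divided by total confirmed cases.
SELECT
    location,
    MAX(TRY_CONVERT(float, population))   AS population,
    MAX(TRY_CONVERT(float, total_cases))  AS total_cases,
    MAX(TRY_CONVERT(float, total_deaths)) AS total_deaths,
    100.0 * MAX(TRY_CONVERT(float, total_cases))
          / NULLIF(MAX(TRY_CONVERT(float, population)), 0)  AS pct_population_infected,
    100.0 * MAX(TRY_CONVERT(float, total_deaths))
          / NULLIF(MAX(TRY_CONVERT(float, total_cases)), 0) AS case_fatality_rate_pct
FROM covid_deaths
WHERE continent IS NOT NULL      -- assumption: NULL continent marks aggregate rows
GROUP BY location
ORDER BY pct_population_infected DESC;
```

NULLIF guards against division by zero for locations with no recorded population or cases; those rows return NULL instead of raising an error.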

Tools and Technologies

  • Database: MS SQL Server for data storage and querying.
  • Language: SQL for data extraction and analysis.
  • Version Control: Git and GitHub for collaboration.

How to Use

Prerequisites

  1. MS SQL Server installed and running.
  2. Datasets from Our World in Data, available here.

Steps

  1. Clone this repository:
    git clone https://github.com/anshulkansal121/COVID_Data_Analysis_SQL.git
  2. Import the datasets into MS SQL Server using the Import Data wizard.
  3. Run the SQL scripts in the sql_queries folder to extract and analyze the data.
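
Before running the analysis scripts, it can help to confirm that the import produced what you expect. The checks below are a sketch under the assumption that the tables were imported as covid_deaths and covid_vaccinations; substitute the names you chose during import.

```sql
-- Sketch: quick sanity checks after importing the data.
-- Row count, number of distinct locations, and the date range covered.
SELECT
    COUNT(*)                 AS row_count,
    COUNT(DISTINCT location) AS locations,
    MIN(date)                AS first_date,
    MAX(date)                AS last_date
FROM covid_deaths;

-- Column types chosen by the import. Numeric columns that came in as text
-- (for example nvarchar) need a cast before they can be summed or averaged.
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME IN ('covid_deaths', 'covid_vaccinations')
ORDER BY TABLE_NAME, ORDINAL_POSITION;
```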

Results and Insights

The project provides insights into:

  • Global and regional trends in COVID-19 cases and deaths.
  • Impact of vaccination rollouts on reducing cases and deaths.
  • Fatality and recovery rates by continent and country.

Contributing

We welcome contributions to improve this project. To contribute:

  1. Fork the repository.
  2. Create a feature branch.
    git checkout -b feature-name
  3. Commit your changes with detailed messages.
    git commit -m "Add detailed description of your changes"
  4. Push your changes to the branch.
    git push origin feature-name
  5. Submit a pull request (PR).

PR Guidelines

  • Ensure your PR answers at least one of the questions listed in the Key Questions Explored section or from this PDF.
  • Include a detailed description of the changes.
  • Reference relevant questions or objectives from the project.
  • Ensure your code follows best practices and is well-commented.

Future Work

  • Explore more trends in the data.
  • Enhance visualizations for interactive dashboards using Power BI or Tableau.

Acknowledgments

Special thanks to Our World in Data for providing comprehensive COVID-19 datasets, and to Alex the Analyst for such an interactive video on data analysis.
