DS-DA-Analysis-of-Bandung-PPDB-2022

This repository contains a data analysis of the 2022 PPDB (Penerimaan Peserta Didik Baru, the new student admission process) in Bandung. The analysis covers school quotas, student scores, and zoning, as well as the socio-economic index of each district.

Libraries Used

  1. numpy
  2. pandas
  3. matplotlib
  4. folium

Data

The data used in this analysis is sourced from several CSV files:

dataset/data_kuota.csv: contains information on the quotas for each school
dataset/data_rapor.csv: contains information on the scores of each student
dataset/data_zonasi.csv: contains information on the zoning of each student
dataset/data_koordinat.csv: contains information on the coordinates of each school
dataset/ik-berdasarkan-aspek-dan-kecamatan-2018.csv: contains information on the socio-economic index of each district
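
The files listed above could be loaded along these lines. This is a minimal sketch, not the repository's actual loading code: the dictionary keys and the `load_datasets` helper are hypothetical names, and the function simply skips any file it cannot find.

```python
from pathlib import Path

import pandas as pd

# File names come from the list above; the short keys are illustrative only.
DATASET_FILES = {
    "kuota": "data_kuota.csv",
    "rapor": "data_rapor.csv",
    "zonasi": "data_zonasi.csv",
    "koordinat": "data_koordinat.csv",
    "indeks": "ik-berdasarkan-aspek-dan-kecamatan-2018.csv",
}


def load_datasets(base_dir="dataset"):
    """Read every listed CSV found under base_dir into a dict of DataFrames."""
    frames = {}
    for key, filename in DATASET_FILES.items():
        path = Path(base_dir) / filename
        if path.exists():  # missing files are skipped rather than raising
            frames[key] = pd.read_csv(path)
    return frames
```

Calling `load_datasets()` from the repository root would return one DataFrame per file that is present, keyed by the short names above.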

Data Cleansing

The data is cleaned and preprocessed before analysis. This includes:

  1. Replacing missing values with np.nan
  2. Changing the data types of certain columns
  3. Renaming columns for consistency
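
The three steps above can be sketched as a single helper. Everything specific here is an assumption made for illustration: the placeholder strings, the column names, and the toy rows are not taken from the actual CSV files.

```python
import numpy as np
import pandas as pd


def clean_frame(df, placeholders=("", "-", "N/A"), rename_map=None, dtypes=None):
    """Apply the three cleansing steps to a copy of df.

    placeholders: values treated as missing (assumed, adjust to the real data)
    rename_map:   old column name -> new column name
    dtypes:       new column name -> target dtype
    """
    cleaned = df.replace(list(placeholders), np.nan)  # step 1: mark missing values
    if rename_map:
        cleaned = cleaned.rename(columns=rename_map)  # step 3: consistent names
    if dtypes:
        cleaned = cleaned.astype(dtypes)              # step 2: fix column types
    return cleaned


# Toy rows for illustration only; they do not come from the dataset.
raw = pd.DataFrame({"Nama Sekolah": ["School A", "School B"], "Kuota": ["120", "-"]})
tidy = clean_frame(
    raw,
    rename_map={"Nama Sekolah": "school", "Kuota": "quota"},
    dtypes={"quota": "float64"},
)
```

Renaming runs before the type conversion so that `dtypes` can refer to the new, consistent column names.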

Analysis

The cleaned data is then used to perform various analyses, such as:

  1. Analyzing the distribution of quotas among schools
  2. Examining the relationship between scores and school quotas
  3. Mapping the locations of schools and the socio-economic index of each district
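
As a rough illustration of these analyses, the sketch below summarizes quotas, correlates each school's mean score with its quota, and places schools on a folium map. The column names (`school`, `quota`, `score`, `lat`, `lon`) and the helper functions are hypothetical, and the socio-economic index layer is not covered.

```python
import pandas as pd


def summarize_quotas(kuota, quota_col="quota"):
    """Descriptive statistics (count, mean, quartiles, ...) of school quotas."""
    return kuota[quota_col].describe()


def score_quota_correlation(rapor, kuota, school_col="school",
                            score_col="score", quota_col="quota"):
    """Pearson correlation between each school's mean score and its quota."""
    mean_scores = rapor.groupby(school_col)[score_col].mean().rename("mean_score")
    merged = kuota.merge(mean_scores, left_on=school_col, right_index=True)
    return merged["mean_score"].corr(merged[quota_col])


def build_school_map(koordinat, name_col="school", lat_col="lat", lon_col="lon"):
    """Folium map with one marker per school, centred on the mean coordinate."""
    import folium  # imported lazily so the pandas helpers work without folium

    center = [koordinat[lat_col].mean(), koordinat[lon_col].mean()]
    school_map = folium.Map(location=center, zoom_start=12)
    for _, row in koordinat.iterrows():
        folium.Marker(
            location=[row[lat_col], row[lon_col]],
            popup=str(row[name_col]),
        ).add_to(school_map)
    return school_map


# Toy values for illustration only; they do not come from the dataset.
kuota = pd.DataFrame({"school": ["A", "B", "C"], "quota": [100, 200, 300]})
rapor = pd.DataFrame({
    "school": ["A", "A", "B", "C", "C"],
    "score": [80, 90, 70, 60, 50],
})
quota_summary = summarize_quotas(kuota)
correlation = score_quota_correlation(rapor, kuota)
```

A map built with `build_school_map` can be written to disk with its `save` method, for example `school_map.save("schools.html")`.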

Note

The code in this repository covers only data preparation and cleansing; no analysis is performed here. The analysis happens after this step and uses the prepared data to answer specific questions or test hypotheses.

Full Analysis

The full analysis is available at: https://drive.google.com/file/d/18Xhse8jhQA-Pc5om4MDbadIiHT7heFy_/view