Graduate Predictive Analytics Course
For the final project, each student had to select a dataset to explore. This dataset comes from the City of Chicago’s Data Portal and contains information about crime in Chicago in 2018. We reduced its size by sampling 15,000 observations from the original 267,000. Models were built to predict the likelihood that a person would be arrested.
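The sampling step described above might look roughly like the following sketch. It assumes a simple random sample without replacement, which the text does not specify; the function name `sample_rows` and the seed are hypothetical, and only the two row counts come from the description.

```python
import random


def sample_rows(n_total, n_sample, seed=0):
    """Return sorted row indices for a simple random sample without replacement."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return sorted(rng.sample(range(n_total), n_sample))


# Row counts taken from the description; the seed value is an arbitrary assumption.
indices = sample_rows(267_000, 15_000, seed=42)
print(len(indices))
```

The resulting indices could then be used to select rows from the full dataset, for example while reading the exported CSV line by line.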
We used various models, including a classification tree, logistic regression, stacking, random forest, regression with k-fold cross-validation, and stepwise regression (backward and forward). Three are discussed in this presentation: the classification tree, logistic regression, and stacking.
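As an illustration of how the three discussed models can fit together, here is a minimal sketch of a stacked classifier that combines a classification tree and a logistic regression. It uses scikit-learn and synthetic stand-in data, both of which are assumptions: the text does not name the tooling, features, or hyperparameters used in the project, so this is not the project's actual pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): y plays the role of the arrest indicator.
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Base learners: a classification tree and a logistic regression.
# A second logistic regression combines their cross-validated predictions.
stack = StackingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
        ("logit", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

# Predicted probability of the positive class for each held-out row.
arrest_prob = stack.predict_proba(X_test)[:, 1]
accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

The tree depth, the number of folds, and the choice of logistic regression as the final estimator are illustrative defaults, not values reported by the project.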