mupd_reports

Scraping, parsing and re-publishing University of Missouri Police Department Incident Reports

Intro

The University of Missouri Police Department publishes data about its cases on its website. The department does a great job of keeping the data up to date, but there are a couple of problems:

The incident page has filter options for finding specific kinds of incidents within a date range and/or at a specific address, which is nice. But some of the less common charges, like making a terrorist threat, aren't categorized under an incident type. Furthermore, not all cases originate from an incident report, so those cases never appear on this list.
The daily Clery reports include every case and more information about each one, including the exact charges and the current disposition of the case. But the daily reports are published as PDFs, which makes them hard to search or analyze.

We can do better. Here's how:

Download the daily Clery reports;
Extract the text from the PDF pages;
Parse that text into a database;
Build a web app for users to interact with this improved data.
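
As a rough illustration of the third step, the sketch below loads already-extracted report text into a SQLite database using only the Python standard library. The layout of the real Clery report text is not described here, so the pipe-delimited line format, the field names, the `cases` table and the helper names `parse_lines` and `load_cases` are all hypothetical stand-ins rather than this project's actual code.

```python
import re
import sqlite3

# ASSUMPTION: one case per line, laid out as
#   case number | date reported | charge | disposition
# The real report text will almost certainly need a different pattern.
LINE_PATTERN = re.compile(
    r"^(?P<case_number>\S+)\s*\|\s*(?P<reported>\d{4}-\d{2}-\d{2})\s*\|\s*"
    r"(?P<charge>[^|]+?)\s*\|\s*(?P<disposition>[^|]+?)\s*$"
)


def parse_lines(text):
    """Yield one dict per line that matches the assumed layout."""
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            yield match.groupdict()


def load_cases(connection, text):
    """Create the (hypothetical) cases table and store every parsed line."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS cases ("
        "case_number TEXT PRIMARY KEY, reported TEXT, "
        "charge TEXT, disposition TEXT)"
    )
    rows = list(parse_lines(text))
    # INSERT OR REPLACE lets a later report overwrite an earlier row
    # for the same case number, e.g. when the disposition changes.
    connection.executemany(
        "INSERT OR REPLACE INTO cases VALUES "
        "(:case_number, :reported, :charge, :disposition)",
        rows,
    )
    connection.commit()
    return len(rows)


if __name__ == "__main__":
    # Made-up sample text, not real report data.
    sample = "0000-0001 | 2000-01-01 | Example charge | Open\nPage 1 of 1\n"
    conn = sqlite3.connect(":memory:")
    print(load_cases(conn, sample))
```

Keying the table on the case number is one possible design choice: because the daily reports carry the current disposition of each case, reloading a newer report would then update existing rows instead of duplicating them.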

Dependencies

Python 2.7+: An interpreted, object-oriented, high-level programming language;
requests: For handling HTTP requests;
html5lib: For parsing HTML the same way any major browser would;
beautifulsoup4: For conveniently manipulating the parsed HTML.
