In this project, I used Sqoop to import crime data from 2016 from MySQL into HDFS. The file import.sh contains the Sqoop command. (I have replaced the real server username and password for security.)
pig_script.pig contains the Pig Latin code. Before the analysis, I extracted the street name from the address column and derived month, day, week and daypart from the DateTime column. After that, I created three reports, which show:
- The street with the highest total number of crimes in each month;
- The street with the highest number of distinct crime types in each month;
- The crime with the most MALE victims in each month.
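
As a rough illustration of the preprocessing and the three reports described above, here is a minimal Python sketch. It is not the Pig Latin in pig_script.pig. The sample rows, the field layout, the address format (a house number followed by the street name), the daypart boundaries, and the reading of "week" as the ISO week number are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical sample rows: (address, timestamp, crime type, victim gender).
ROWS = [
    ("100 MAIN ST", "2016-01-05 08:30:00", "THEFT", "MALE"),
    ("220 MAIN ST", "2016-01-09 22:10:00", "ASSAULT", "FEMALE"),
    ("15 OAK AVE", "2016-01-12 13:00:00", "THEFT", "MALE"),
    ("48 OAK AVE", "2016-02-02 03:45:00", "BURGLARY", "MALE"),
    ("77 OAK AVE", "2016-02-14 17:20:00", "THEFT", "MALE"),
    ("300 MAIN ST", "2016-02-20 09:00:00", "BURGLARY", "MALE"),
]


def street_name(address):
    """Drop a leading house number (assumed address format)."""
    return re.sub(r"^\d+\s+", "", address.strip())


def daypart(hour):
    """Map an hour (0-23) to a daypart; the bin edges are an assumption."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"


def enrich(row):
    """Add street, month, day, week and daypart to one raw row."""
    address, stamp, crime, gender = row
    ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return {
        "street": street_name(address),
        "month": ts.month,
        "day": ts.day,
        "week": ts.isocalendar()[1],  # ISO week number (assumed meaning)
        "daypart": daypart(ts.hour),
        "crime": crime,
        "gender": gender,
    }


def best_per_month(scores):
    """Turn {(month, key): number} into {month: (key, number)} keeping the maximum.

    Ties go to the key that sorts first.
    """
    best = {}
    for (month, key), value in sorted(scores.items()):
        if month not in best or value > best[month][1]:
            best[month] = (key, value)
    return best


records = [enrich(row) for row in ROWS]

# Report 1: total number of crimes per (month, street).
crime_counts = Counter((r["month"], r["street"]) for r in records)

# Report 2: number of distinct crime types per (month, street).
crime_types = defaultdict(set)
for r in records:
    crime_types[(r["month"], r["street"])].add(r["crime"])
unique_counts = {key: len(types) for key, types in crime_types.items()}

# Report 3: number of MALE victims per (month, crime type).
male_counts = Counter(
    (r["month"], r["crime"]) for r in records if r["gender"] == "MALE"
)

report_total = best_per_month(crime_counts)
report_unique = best_per_month(unique_counts)
report_male = best_per_month(male_counts)

print(report_total)   # {1: ('MAIN ST', 2), 2: ('OAK AVE', 2)}
print(report_unique)  # {1: ('MAIN ST', 2), 2: ('OAK AVE', 2)}
print(report_male)    # {1: ('THEFT', 2), 2: ('BURGLARY', 2)}
```

Each report follows the same pattern: group by month plus one other field, compute a count, then keep the top entry within each month. In Pig Latin this would typically be a GROUP followed by a nested ordering or limit, but the exact statements depend on the real schema.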