My introduction to Scala: a project that estimates PageRank values for Wikipedia articles using Wikipedia dump data.
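For readers unfamiliar with PageRank, here is a minimal sketch of the usual iterative formulation in plain Scala. It is not the code from this project: the object name `PageRankSketch`, the tiny link graph, the damping factor of 0.85, the iteration count, and the even redistribution of rank from pages with no outgoing links are all assumptions chosen for illustration. The assignment may specify a different variant.

```scala
object PageRankSketch {
  // links maps each page title to the titles it links to (hypothetical input shape).
  def pageRank(
      links: Map[String, Seq[String]],
      iterations: Int = 25,   // assumed iteration count
      damping: Double = 0.85  // assumed damping factor
  ): Map[String, Double] = {
    // Every page that appears as a source or as a link target.
    val pages = (links.keySet ++ links.values.flatten).toSeq
    val n = pages.size
    // Start from a uniform distribution.
    var ranks: Map[String, Double] = pages.map(p => p -> 1.0 / n).toMap

    for (_ <- 1 to iterations) {
      // Each page splits its current rank evenly across its outgoing links.
      val contribs: Map[String, Double] = links.toSeq
        .flatMap { case (src, outs) => outs.map(dst => dst -> ranks(src) / outs.size) }
        .groupBy(_._1)
        .map { case (dst, cs) => dst -> cs.map(_._2).sum }

      // Rank held by pages with no outgoing links is spread over all pages.
      val danglingMass =
        pages.filter(p => links.getOrElse(p, Seq.empty).isEmpty).map(ranks).sum

      ranks = pages.map { p =>
        val incoming = contribs.getOrElse(p, 0.0) + danglingMass / n
        p -> ((1 - damping) / n + damping * incoming)
      }.toMap
    }
    ranks
  }

  def main(args: Array[String]): Unit = {
    // Toy graph standing in for the Wikipedia link data.
    val toy = Map(
      "A" -> Seq("B", "C"),
      "B" -> Seq("C"),
      "C" -> Seq("A")
    )
    pageRank(toy).toSeq.sortBy(-_._2).foreach { case (page, rank) =>
      println(f"$page%s $rank%.4f")
    }
  }
}
```

On the real dataset the same update would presumably run over a distributed collection rather than an in-memory `Map`, but the per-iteration arithmetic is the same idea.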
Full details of this project can be found here: http://www.cs.colostate.edu/~cs435/CS435Fall18/PA3/PA3-Fall18-CS435.pdf.
The datasets used to run this project are linked on page 5 of the PDF above. Due to their size, I am unable to upload them to GitHub. If you would like to run this project, you will need to replace all HDFS hosts and port numbers with your own. As-is, all IP addresses have been replaced with "montgomery". :)
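
One way to make the host and port substitution less tedious is to build HDFS paths from a single place. The sketch below is only a suggestion and is not part of this project: the helper `hdfsUri`, the environment variable names `HDFS_HOST` and `HDFS_PORT`, the fallback port 9000, and the example path are all hypothetical.

```scala
object HdfsPaths {
  // Hypothetical helper: builds an hdfs:// URI from a host, port, and path.
  def hdfsUri(host: String, port: Int, path: String): String = {
    val normalized = if (path.startsWith("/")) path else "/" + path
    s"hdfs://$host:$port$normalized"
  }

  def main(args: Array[String]): Unit = {
    // Assumed environment variables; "montgomery" mirrors the placeholder used in this repo.
    val host = sys.env.getOrElse("HDFS_HOST", "montgomery")
    // 9000 is an assumed default; use whatever port your NameNode listens on.
    val port = sys.env.get("HDFS_PORT").map(_.toInt).getOrElse(9000)
    // Example path only; substitute the location of your own dataset.
    println(hdfsUri(host, port, "/pa3/links.txt"))
  }
}
```

With something like this, switching clusters would mean changing two environment variables instead of editing every path in the source.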