
ProjectBigdata/dutch-job-advertisement-in-twitter


Characterizing Dutch Job Advertisements in Twitter

Group 06 (Aaqib Saeed, Muhammad Arif Wicaksana, Alexandru Serban)

In this project, we characterize Dutch job advertisement tweets from the period 2014-2015. The questions we try to answer:

  1. What kinds of jobs are posted on Twitter?
  2. How do job postings trend over the period?
  3. Which areas offer the most jobs on Twitter?

Our screencast:

Screencast video

## Files:

  • geo-tagged.pig works on the geo-tagged tweets generated by twitter-geotagged.jar.
  • nongeo-tagged.pig samples the tweets generated by the MapReduce job in twitter-nongeotagged.jar.
  • word-freq.pig generates word counts/frequencies; its output is used by the R script.
  • The folder bigdata-0.2 contains the MapReduce code.
  • The folder parsed_job_tweets contains the job tweets parsed from the datasets.

## Usage:

### MapReduce

hadoop jar twitter.jar nl.utwente.bigdata.TwitterR <INPUT DIRECTORY> <OUTPUT DIRECTORY>

### Pig Latin

pig -x mapreduce wordfrequency.pig
pig -x mapreduce general.pig

### R

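# wordcloud() comes from the wordcloud package, brewer.pal() from RColorBrewer
library(wordcloud)
library(RColorBrewer)

# Read the word/frequency pairs: V1 holds the word, V2 its count
dataset <- read.delim("[FILE PATH]", header=FALSE, quote="", stringsAsFactors=FALSE)
dataset$V1 <- tolower(dataset$V1)

# The "Dark2" palette has at most 8 colors
wordcloud(words = dataset$V1, freq = dataset$V2, colors=brewer.pal(8, "Dark2"))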
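
The R snippet reads a two-column, tab-separated file: a word in the first column (V1) and its count in the second (V2). As a rough illustration of that layout, the following plain-Java sketch counts words in a few tweets and prints `word<TAB>count` lines. It is not the project's Pig script or MapReduce code; the class name `WordFrequencySketch`, its methods, the whitespace tokenization, and the sample tweets are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the repository.
class WordFrequencySketch {

    // Count lower-cased, whitespace-separated tokens across all tweets.
    // Assumption: simple whitespace tokenization; the real scripts may differ.
    static Map<String, Integer> countWords(List<String> tweets) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String tweet : tweets) {
            for (String token : tweet.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // skip the empty token produced by a blank tweet
                }
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Render one "word<TAB>count" line per entry, the two-column layout
    // that read.delim(..., header=FALSE) loads as V1 and V2.
    static List<String> toTsvLines(Map<String, Integer> counts) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            lines.add(entry.getKey() + "\t" + entry.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample tweets, for illustration only.
        List<String> tweets = List.of(
                "Vacature monteur Enschede",
                "vacature verpleegkundige");
        toTsvLines(countWords(tweets)).forEach(System.out::println);
    }
}
```

Redirecting the printed lines to a file gives something that can be passed as "[FILE PATH]" to the R snippet. A `TreeMap` is used only so the output order is deterministic.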
