Characterizing Dutch Job Advertisements in Twitter

Group 06 (Aaqib Saeed, Muhammad Arif Wicaksana, Alexandru Serban)

In this project, we characterize Dutch job advertisement tweets from the period 2014-2015. We try to answer the following questions:

What kinds of jobs are posted on Twitter?
How do job postings trend over the period?
Which area offers the most jobs on Twitter?

Our screencast:

##Files:

geo-tagged.pig processes the geo-tagged tweets generated by twitter-geotagged.jar.
nongeo-tagged.pig samples the tweets generated by the MapReduce job in twitter-nongeotagged.jar.
word-freq.pig generates word counts/frequencies; its output is used by the R script.
The folder bigdata-0.2 contains the MapReduce code.
The folder parsed_job_tweets contains the job tweets parsed from the datasets.
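
To make the word-count step concrete, here is a minimal Python sketch of the same idea. It is not the repository's word-freq.pig script: the tokenization rule, the function names `word_frequencies` and `to_tsv`, and the sample tweets are all assumptions made for illustration. It emits two tab-separated columns (word, count), which is the layout the R snippet in the Usage section reads as V1 and V2.

```python
from collections import Counter
import re


def word_frequencies(tweets):
    """Count lowercase word occurrences across an iterable of tweet texts."""
    counts = Counter()
    for text in tweets:
        # Assumption: a "word" is any run of letters, digits, or underscores.
        counts.update(re.findall(r"\w+", text.lower()))
    return counts


def to_tsv(counts):
    """Render counts as 'word<TAB>count' lines, most frequent first."""
    return "\n".join(f"{word}\t{n}" for word, n in counts.most_common())


# Hypothetical sample tweets, for illustration only.
sample = [
    "Vacature: verpleegkundige in Utrecht",
    "Nieuwe vacature developer Utrecht",
]
freqs = word_frequencies(sample)
print(to_tsv(freqs))
```

On the real dataset, the equivalent grouping and counting would run inside the Pig job, so this sketch is only useful for checking the expected output format on a handful of tweets.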

##Usage:

###MapReduce

hadoop jar twitter.jar nl.utwente.bigdata.TwitterR <INPUT DIRECTORY> <OUTPUT DIRECTORY>

###Pig Latin

pig -x mapreduce wordfrequency.pig
pig -x mapreduce general.pig

###R

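library(wordcloud)
library(RColorBrewer)  # for brewer.pal()
# Tab-separated word counts: V1 = word, V2 = frequency
dataset <- read.delim("[FILE PATH]", header=FALSE, quote="", stringsAsFactors=FALSE)
dataset$V1 <- tolower(dataset$V1)

wordcloud(words = dataset$V1, freq = dataset$V2, colors=brewer.pal(8, "Dark2"))  # "Dark2" has at most 8 colors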

