#SalesPrediction

Predict sales by analyzing social network data.

##Install

The code is implemented using Python 3. You also need to install BeautifulSoup and Twython:

$ pip install beautifulsoup4

$ pip install twython

##Get Started

###Get data from Twitter

You need a text file that lists all the accounts you want to fetch. A sample file sample_list.txt looks like:

id_1
id_2
id_3
...

Then use fetch_twitter.py to fetch the data and save it to a file:

$ python fetch_twitter.py sample_list.txt
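To illustrate the expected list format, here is a minimal sketch of reading such a file, one account ID per line. The helper name `read_account_ids` is hypothetical and is not taken from fetch_twitter.py; the script's actual parsing may differ.

```python
from pathlib import Path


def read_account_ids(list_path):
    """Return the non-empty, whitespace-stripped lines of the account list file."""
    ids = []
    for line in Path(list_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:  # skip blank lines
            ids.append(line)
    return ids
```

Calling `read_account_ids("sample_list.txt")` would return the account IDs in file order.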

###Run the pipeline

After fetching data from Twitter, you can run the pipeline with two arguments: datapath and limit. For example:

$ python pipeline.py ./data 10
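As a rough sketch of how two positional arguments like these can be handled, the snippet below uses `argparse`. It is not the code in pipeline.py: the function name `parse_args` is hypothetical, and treating limit as an integer is an assumption based on the example value 10.

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional arguments: datapath and limit."""
    parser = argparse.ArgumentParser(description="Run the sales prediction pipeline.")
    parser.add_argument("datapath", help="directory holding the fetched Twitter data")
    # Assumption: limit is an integer, as in the example invocation.
    parser.add_argument("limit", type=int, help="numeric limit passed to the pipeline")
    return parser.parse_args(argv)
```

With this sketch, `parse_args(["./data", "10"])` yields an object whose `datapath` is `"./data"` and whose `limit` is the integer 10.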

###Usage

As most of the work is network I/O intensive, it benefits from running in parallel. We use multiple processes when searching eBay. However, we did not have time to parallelize the main logic that runs for each tweet. Instead, you can divide the Twitter data into several sets, such as ./data1, ./data2, ./data3, and so on. Then you can open several terminals and process each set concurrently.
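The manual splitting described above could also be scripted. The following is a minimal sketch, assuming the fetched data is stored as individual files directly inside the data directory (the actual layout is not specified here). The helper names `split_data` and `run_concurrently` are hypothetical; only pipeline.py and its two arguments come from this README.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def split_data(datapath, parts):
    """Copy the files in datapath round-robin into datapath1, datapath2, ..."""
    src = Path(datapath)
    files = sorted(p for p in src.iterdir() if p.is_file())
    # Sibling directories named after the source, e.g. ./data1, ./data2, ...
    targets = [src.parent / f"{src.name}{i + 1}" for i in range(parts)]
    for target in targets:
        target.mkdir(parents=True, exist_ok=True)
    for index, path in enumerate(files):
        shutil.copy2(path, targets[index % parts] / path.name)
    return targets


def run_concurrently(targets, limit):
    """Start one pipeline.py process per data set and wait for all of them."""
    procs = [
        subprocess.Popen([sys.executable, "pipeline.py", str(target), str(limit)])
        for target in targets
    ]
    return [proc.wait() for proc in procs]
```

For example, `run_concurrently(split_data("./data", 3), 10)` would create ./data1, ./data2 and ./data3 and run three pipeline processes side by side, which has the same effect as opening three terminals.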