#SalesPrediction
Predict sales by analyzing social network data.
##Install
The code is implemented using Python 3. You also need to install BeautifulSoup and Twython:
$ pip install beautifulsoup4
$ pip install twython
##Get Started
###Get data from twitter
You need a text file that list all the account you want to fetch. A sample file sample_list.txt
looks like:
id_1
id_2
id_3
...
Then use fetch_twitter.py
to get data and save to file:
$ python fetch_twitter.py sample_list.txt
###Run the pipeline
After fetching data from twitter, you can run the pipeline with two arguments datapath
and limit
. For example:
$ python pipeline.py ./data 10
###Usage
As most of the work is network I/O intensive, it's better to do parallel computing. We do use multiple processes when searching eaby. However, we don't have time to implement the main logic for each tweets in parallel. Instead, you can divide the twitter data into several sets, like ./data1
, ./data2
, ./data3
, ... . Then you can open several terminals and process each set concurrently.