An investigation in Python of predicting the virality of a news article from its metadata. It uses the UCI dataset from the paper below. The best accuracy, 67%, is achieved with a neural network, which is a competitive result (original paper baseline: 67%, current state of the art: 69%).
K. Fernandes, P. Vinagre and P. Cortez. A Proactive Intelligent Decision Support System for Predicting the Popularity of Online News. Proceedings of the 17th EPIA 2015 - Portuguese Conference on Artificial Intelligence, September, Coimbra, Portugal.
The full report can be viewed here.
Keras
NumPy
pandas
scikit-learn
You can install the requirements via:
pip install -r requirements.txt
For faster training of the neural net, I recommend using a GPU. Create a new Anaconda environment and install all the requirements except for Keras. Then run:
conda install -c anaconda keras-gpu
If you need help getting started with Anaconda, see: Getting Started With Anaconda
Navigate into the Code directory.
To run the full grid search, run:
python main.py
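The contents of main.py are not reproduced here, so the following is only a minimal sketch of what a grid search looks like with scikit-learn. The synthetic data, the logistic regression stand-in model, and the parameter grid are all assumptions made for illustration; they are not the neural network or the search space used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data (assumption: stands in for the
# article metadata features and the viral / not-viral label).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical search space: a few regularisation strengths.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

# Try every value in the grid with 3-fold cross-validation and
# keep the setting with the best mean accuracy.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", round(search.best_score_, 3))
```

A full grid search fits one model per parameter combination per fold, so run time grows quickly with the size of the grid.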
The correlation matrix, computed with Pearson's coefficient, is shown below.
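As a rough illustration of how such a matrix can be computed, here is a minimal sketch using pandas. The column names and the synthetic data are hypothetical placeholders, not the features of the UCI dataset, and this is not necessarily how the project's own code produces its figure.

```python
import numpy as np
import pandas as pd

# Synthetic data (assumption: placeholder columns, not the real features).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
df = pd.DataFrame({
    "feature_a": x,
    "feature_b": 2.0 * x + rng.normal(scale=0.1, size=n),  # strongly tied to feature_a
    "feature_c": rng.normal(size=n),                        # independent noise
})

# Pairwise Pearson correlation between every pair of columns.
corr = df.corr(method="pearson")
print(corr.round(2))
```

Each entry lies between -1 and 1, the diagonal is 1, and the matrix is symmetric, so only the upper or lower triangle needs to be inspected.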
Below is a summary table of the results. Bolded values represent best-in-class results for the experiments provided.