A machine-learning-based stock evaluator that labels stocks as a buy or a sell based on certain fundamental signals.
First, let me start off with a meme to sum up the entire project.
- Python 3
- Scikit-learn
- matplotlib
- Pandas
- RegEx
- Quandl
- Yahoo Finance
- Backtesting strategy after a train and validation
- Comparing return to the market average
The code follows the standard workflow used in almost all machine learning projects:
- Data Collection: A scraper used regular expressions to strip data from Yahoo Finance HTML files, and the amazing Quandl API helped fill in the rest of the data.
- Data Preparation: Using the scikit-learn library in Python, the data was shuffled and split into training and evaluation sets. An additional script filled in missing values.
- Model: Linear Support Vector Classification.
- Train: Stocks were labeled 0 for underperforming and 1 for outperforming. Training itself is a simple function call from the scikit-learn library.
- Evaluate: Compares the return of the chosen stocks to the return of the S&P 500.
- Parameter Tuning: More of an art form. The test size was varied, and features were added and removed. Feature weighting was a little hard for my first machine learning algorithm.
- Predict and Test: A very basic backtester measured the return that would have resulted from following the algorithm. On average the predictions were 56% correct. That may not sound bad, but on its own it says little: if the losses from the 44% of wrong calls outweigh the gains from the 56% of correct ones, the algorithm was unsuccessful. Overall, the algo outperformed the S&P 500 by 9%, which is actually pretty good (and probably unrealistic)!
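The workflow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. The features are synthetic stand-ins for the scraped fundamentals, the median imputation is only one assumed way of filling in missing values, and every function name here (`make_synthetic_data`, `train_and_score`, `excess_return`, `expected_return_per_pick`) is hypothetical.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def make_synthetic_data(n_samples=400, n_features=5, seed=0):
    """Hypothetical stand-in for the scraped fundamentals (not real market data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    # Label: 1 = outperformed the market, 0 = underperformed (synthetic rule).
    noise = rng.normal(scale=0.5, size=n_samples)
    y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)
    # Blank out about 5% of the values to mimic gaps in scraped data.
    X[rng.random(X.shape) < 0.05] = np.nan
    return X, y


def train_and_score(X, y, test_size=0.25, seed=0):
    """Shuffle and split, fill missing values, fit a linear SVC, return (model, accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, shuffle=True, random_state=seed
    )
    # Assumption: median imputation stands in for the separate fill-in script.
    model = make_pipeline(SimpleImputer(strategy="median"), LinearSVC())
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


def excess_return(pick_returns, benchmark_return):
    """Mean return of the chosen stocks minus the benchmark (e.g. S&P 500) return."""
    return float(np.mean(pick_returns)) - benchmark_return


def expected_return_per_pick(hit_rate, avg_gain, avg_loss):
    """Average return per pick; avg_loss is the typical loss given as a positive number."""
    return hit_rate * avg_gain - (1.0 - hit_rate) * avg_loss


if __name__ == "__main__":
    X, y = make_synthetic_data()
    _, accuracy = train_and_score(X, y)
    print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
    # Same 56% hit rate, two different payoff profiles (illustrative numbers).
    print(expected_return_per_pick(0.56, 0.10, 0.10))
    print(expected_return_per_pick(0.56, 0.10, 0.15))
```

The last two calls illustrate the point about the 56% hit rate. With 10% average gains and 10% average losses, the expected return per pick is about +1.2%. With the same hit rate but 15% average losses, it drops to about -1.0%. Accuracy alone does not say whether the strategy makes money.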
- Automate data pull to update weekly
- Use technical signals
- A more in-depth backtester
- Future week prediction