Finding Arbitrage in NBA Betting

The goal of this project is to produce a model that can predict NBA scoring performances more accurately than the consensus line being offered. A secondary objective is to identify the features with the most predictive power.

To start exploring this goal, I looked at the correlation between our target variable (result), and the features present in the baseline data.

The results have some promising features, such as Projection and Line.

Next, I one hot encoded all of the categorical features in our betting dataset (except name), and fit a basic regression model (R2 = .45 -- not bad for baseline!)

And looked for any standout correlations in the subsequent heatmap

Next steps will be the followig:

Deepen the scrape function for a more complete dataset
Join the team and player data, engineer a number of rolling average and trend-style features on that dataset.
Merge engineered dataset with the betting data, clean final data, fit a LassoCV model, pick the top feautres, and re-run with OLS Regression model.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MVP.md

MVP.md

Finding Arbitrage in NBA Betting

Files

MVP.md

Latest commit

History

MVP.md

File metadata and controls

Finding Arbitrage in NBA Betting