The Elements of Statistical Learning. Hastie, Tibshirani, Friedman
All of Statistics. Larry Wasserman
Bayesian Reasoning and Machine Learning. David Barber
Gaussian Processes for Machine Learning. Rasmussen and Williams
Information Theory, Inference, and Learning Algorithms. David MacKay
Introduction to Machine Learning. Smola and Vishwanathan
A Probabilistic Theory of Pattern Recognition. Devroye, Györfi, Lugosi
Introduction to Information Retrieval. Manning, Raghavan, Schütze
Forecasting: Principles and Practice. Hyndman, Athanasopoulos. (Online Book)
Introduction to Statistical Thought. Lavine
Basic Probability Theory. Robert Ash
Introduction to Probability. Grinstead and Snell
Principles of Uncertainty. Kadane
Linear Algebra / Optimization
Linear Algebra, Theory, and Applications. Kuttler
Linear Algebra Done Wrong. Treil
Applied Numerical Computing. Vandenberghe
Applied Numerical Linear Algebra. James Demmel
Convex Optimization. Boyd and Vandenberghe
A Field Guide to Genetic Programming. Poli, Langdon, McPhee
Evolved To Win. Sipper
Essentials of Metaheuristics. Luke