Opinionated list of resources facilitating model interpretability (introspection, simplification, visualization, explanation).
- Interpretable models
- Simple decision trees
- Rules
- (Regularized) linear regression
- k-NN
- Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model by Benjamin Letham, Cynthia Rudin, Tyler H. McCormick, David Madigan
- Predictive learning via rule ensembles by Jerome H. Friedman, Bogdan E. Popescu
- Comprehensible classification models by Alex A. Freitas
- https://dx.doi.org/10.1145/2594473.2594475
- http://www.kdd.org/exploration_files/V15-01-01-Freitas.pdf
- Interesting discussion of interpretability for a few classification models (decision trees, classification rules, decision tables, nearest neighbors, and Bayesian network classifiers)
- Models offering feature importance measures
- Random forest
- Boosted trees
- Extremely randomized trees
- Extremely randomized trees by Pierre Geurts, Damien Ernst, Louis Wehenkel
- Random ferns
- rFerns: An Implementation of the Random Ferns Method for General-Purpose Machine Learning by Miron B. Kursa
- Linear regression (with a grain of salt)
- Model Class Reliance: Variable Importance Measures for any Machine Learning Model Class, from the “Rashomon” Perspective by Aaron Fisher, Cynthia Rudin, Francesca Dominici
- https://arxiv.org/pdf/1801.01489
- https://github.com/aaronjfisher/mcr
- Universal (model agnostic) variable importance measure
- Visualizing the Feature Importance for Black Box Models by Giuseppe Casalicchio, Christoph Molnar, Bernd Bischl
- https://arxiv.org/pdf/1804.06620
- https://github.com/giuseppec/featureImportance
- Global and local (model agnostic) variable importance measure (based on Model Reliance)
- Bias in random forest variable importance measures: Illustrations, sources and a solution by Carolin Strobl, Anne-Laure Boulesteix, Achim Zeileis, Torsten Hothorn
- Conditional Variable Importance for Random Forests by Carolin Strobl, Anne-Laure Boulesteix, Thomas Kneib, Thomas Augustin, Achim Zeileis
- Very good blog post describing deficiencies of random forest feature importance and the permutation importance
- Permutation importance: a simple model-agnostic approach, described in the ELI5 documentation
- Classification of feature selection methods
- Filters
- Wrappers
- Embedded methods
- Filter Methods
- Relief-Based Feature Selection: Introduction and Review by Ryan J. Urbanowicz, Melissa Meeker, William LaCava, Randal S. Olson, Jason H. Moore
- Benchmarking Relief-Based Feature Selection Methods for Bioinformatics Data Mining by Ryan J. Urbanowicz, Randal S. Olson, Peter Schmitt, Melissa Meeker, Jason H. Moore
- On the Use of Variable Complementarity for Feature Selection in Cancer Classification by Patrick Meyer, Gianluca Bontempi
- https://dx.doi.org/10.1007/11732242_9
- https://pdfs.semanticscholar.org/d72f/f5063520ce4542d6d9b9e6a4f12aafab6091.pdf
- Introduces an information-theoretic method: double input symmetrical relevance (DISR)
- Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection by Gavin Brown, Adam Pocock, Ming-Jie Zhao, Mikel Luján
- http://www.jmlr.org/papers/volume13/brown12a/brown12a.pdf
- Code: https://github.com/Craigacp/FEAST
- Discusses various approaches based on mutual information (MIM, mRMR, MIFS, CMIM, JMI, DISR, ICAP, CIFE, CMI)
- Feature selection via joint likelihood by Adam Pocock
- Wrapper methods
- Feature Selection with the Boruta Package by Miron B. Kursa, Witold R. Rudnicki
- Boruta for those in a hurry
- General
- Special issue of JMLR on feature selection (oldish, 2003)
- Result Analysis of the NIPS 2003 Feature Selection Challenge by Isabelle Guyon, Steve Gunn, Asa Ben-Hur, Gideon Dror
- Irrelevant Features and the Subset Selection Problem by George John, Ron Kohavi, Karl Pfleger
- https://pdfs.semanticscholar.org/a83b/ddb34618cc68f1014ca12eef7f537825d104.pdf
- Classic paper discussing weakly relevant features, irrelevant features, strongly relevant features
- Consistent Feature Selection for Pattern Recognition in Polynomial Time by Roland Nilsson, José Peña, Johan Björkegren, Jesper Tegnér
- http://www.jmlr.org/papers/volume8/nilsson07a/nilsson07a.pdf
- Discusses minimal optimal vs all-relevant approaches to feature selection
- Feature Engineering and Selection by Kuhn & Johnson
- Slightly off-topic, but very interesting book
- http://www.feat.engineering/index.html
- https://bookdown.org/max/FES/
- https://github.com/topepo/FES
- Magnets by R. P. Feynman https://www.youtube.com/watch?v=wMFPe-DwULM
- To Explain or to Predict? by Galit Shmueli
- The Mythos of Model Interpretability by Zachary C. Lipton
- The Promise and Peril of Human Evaluation for Model Interpretability by Bernease Herman
- Towards A Rigorous Science of Interpretable Machine Learning by Finale Doshi-Velez, Been Kim
- The Book of Why: The New Science of Cause and Effect by Judea Pearl
- Looking Inside the Black Box, presentation by Leo Breiman
- Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation by Alex Goldstein, Adam Kapelner, Justin Bleich, Emil Pitkin
- “Why Should I Trust You?”: Explaining the Predictions of Any Classifier by Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin
- https://arxiv.org/pdf/1602.04938
- Code: https://github.com/marcotcr/lime
- https://github.com/marcotcr/lime-experiments
- https://www.youtube.com/watch?v=bCgEP2zuYxI
- Introduces the LIME method (Local Interpretable Model-agnostic Explanations)
- A Model Explanation System: Latest Updates and Extensions by Ryan Turner
- Understanding Black-box Predictions via Influence Functions by Pang Wei Koh, Percy Liang
- A Unified Approach to Interpreting Model Predictions by Scott Lundberg, Su-In Lee
- https://arxiv.org/pdf/1705.07874
- Code: https://github.com/slundberg/shap
- Introduces the SHAP method (SHapley Additive exPlanations), generalizing LIME
- Anchors: High-Precision Model-Agnostic Explanations by Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin
- Learning to Explain: An Information-Theoretic Perspective on Model Interpretation by Jianbo Chen, Le Song, Martin J. Wainwright, Michael I. Jordan
- Explanations of model predictions with live and breakDown packages by Mateusz Staniak, Przemyslaw Biecek
- A review book - Interpretable Machine Learning. A Guide for Making Black Box Models Explainable by Christoph Molnar
- Visualizing and Understanding Convolutional Networks by Matthew D Zeiler, Rob Fergus
- Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps by Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
- Understanding Neural Networks Through Deep Visualization by Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, Hod Lipson
- Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization by Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra
- Generating Visual Explanations by Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele, Trevor Darrell
- Rationalizing Neural Predictions by Tao Lei, Regina Barzilay, Tommi Jaakkola
- Gradients of Counterfactuals by Mukund Sundararajan, Ankur Taly, Qiqi Yan
- Pixel entropy can be used to detect relevant picture regions (for ConvNets)
- See Visualization section and Fig. 5 of the paper
- High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks by Krzysztof J. Geras, Stacey Wolfson, Yiqiu Shen, Nan Wu, S. Gene Kim, Eric Kim, Laura Heacock, Ujas Parikh, Linda Moy, Kyunghyun Cho
- SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability by Maithra Raghu, Justin Gilmer, Jason Yosinski, Jascha Sohl-Dickstein
- Visual Explanation by Interpretation: Improving Visual Feedback Capabilities of Deep Neural Networks by Jose Oramas, Kaili Wang, Tinne Tuytelaars
- Axiomatic Attribution for Deep Networks by Mukund Sundararajan, Ankur Taly, Qiqi Yan
- https://arxiv.org/pdf/1703.01365
- Code: https://github.com/ankurtaly/Integrated-Gradients
- Proposes the Integrated Gradients method
- See also: Gradients of Counterfactuals https://arxiv.org/pdf/1611.02639.pdf
- Learning Important Features Through Propagating Activation Differences by Avanti Shrikumar, Peyton Greenside, Anshul Kundaje
- https://arxiv.org/pdf/1704.02685
- Proposes the DeepLIFT method
- Code: https://github.com/kundajelab/deeplift
- Videos: https://www.youtube.com/playlist?list=PLJLjQOkqSRTP3cLB2cOOi_bQFw6KPGKML
- The (Un)reliability of saliency methods by Pieter-Jan Kindermans, Sara Hooker, Julius Adebayo, Maximilian Alber, Kristof T. Schütt, Sven Dähne, Dumitru Erhan, Been Kim
- https://arxiv.org/pdf/1711.00867
- Review of failures for methods extracting most important pixels for prediction
- Classifier-agnostic saliency map extraction by Konrad Zolna, Krzysztof J. Geras, Kyunghyun Cho
- The Building Blocks of Interpretability
- https://distill.pub/2018/building-blocks
- Has some embedded links to notebooks
- Uses Lucid library https://github.com/tensorflow/lucid
- Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples by Gail Weiss, Yoav Goldberg, Eran Yahav
- Distilling a Neural Network Into a Soft Decision Tree by Nicholas Frosst, Geoffrey Hinton
- Visualizing Statistical Models: Removing the blindfold
- Partial dependence plots
- http://scikit-learn.org/stable/auto_examples/ensemble/plot_partial_dependence.html
- pdp: An R Package for Constructing Partial Dependence Plots https://journal.r-project.org/archive/2017/RJ-2017-016/RJ-2017-016.pdf https://cran.r-project.org/web/packages/pdp/index.html
- ggfortify: Unified Interface to Visualize Statistical Results of Popular R Packages
- RandomForestExplainer
- ggRandomForest
- Tutorial on Interpretable machine learning at ICML 2017
- P. Biecek, Show Me Your Model: tools for visualisation of statistical models
- S. Ritchie, Just-So Stories of AI
- C. Jarmul, Towards Interpretable Accountable Models
- I. Ozsvald, Machine Learning Libraries You’d Wish You’d Known About
- A large part of the talk covers model explanation and visualization
- Video: https://www.youtube.com/watch?v=nDF7_8FOhpI
- Associated notebook on explaining regression predictions: https://github.com/ianozsvald/data_science_delivered/blob/master/ml_explain_regression_prediction.ipynb
- G. Varoquaux, Understanding and diagnosing your machine-learning models (covers PDP and LIME among others)
- Interpretable ML Symposium (NIPS 2017) (contains links to papers, slides and videos)
- http://interpretable.ml/
- Debate, Interpretability is necessary in machine learning
- Workshop on Human Interpretability in Machine Learning (WHI), organised in conjunction with ICML
- 2018 (contains links to papers and slides)
- 2017 (contains links to papers and slides)
- 2016 (contains links to papers)
- Analyzing and interpreting neural networks for NLP (BlackboxNLP), organised in conjunction with EMNLP 2018
- FAT/ML Fairness, Accountability, and Transparency in Machine Learning https://www.fatml.org/
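
As a rough illustration of the permutation importance idea referenced in the list above, here is a minimal NumPy-only sketch. It is not the ELI5 implementation: the function name `permutation_importance_sketch`, the `mse` helper, and the toy least-squares "black box" are all assumptions made for this example. Importance is measured as the average increase in error after shuffling one column at a time.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error (illustrative metric; lower is better)."""
    return float(np.mean((y_true - y_pred) ** 2))


def permutation_importance_sketch(predict, X, y, metric, n_repeats=10, seed=0):
    """Average increase in `metric` after shuffling each column of X.

    `predict` is any callable mapping a feature matrix to predictions,
    so the procedure does not depend on the model type.
    """
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(metric(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return importances


# Toy data (assumption): feature 0 matters most, feature 2 not at all.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + data_rng.normal(scale=0.1, size=500)

# A least-squares fit stands in for an arbitrary black-box model.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)


def predict(A):
    return A @ coef


imp = permutation_importance_sketch(predict, X, y, mse)
print(imp)
```

In practice the importance is usually computed on held-out data rather than the training set, and correlated features can distort the result, which is one of the issues the Strobl et al. papers listed above discuss.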
Software related to papers is mentioned along with each publication. Here only standalone software is included.
- DALEX - Descriptive mAchine Learning EXplanations
- ELI5 - Python package dedicated to debugging machine learning classifiers and explaining their predictions
- forestmodel - R package visualizing coefficients of different models with the so-called forest plot
- fscaret - Automated Feature Selection from ‘caret’
- iml - An R package for Interpretable Machine Learning
- lime - R package implementing LIME
- Lucid - a collection of infrastructure and tools for research in neural network interpretability
- praznik - a collection of feature selection filters performing greedy optimisation of mutual information-based usefulness criteria, see JMLR 13, 27−66 (2012)
- yellowbrick - visual analysis and diagnostic tools to facilitate machine learning model selection
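
For readers new to the partial dependence plots listed earlier, the underlying computation can be sketched in a few lines of NumPy. This is a simplified illustration, not the code of the `pdp` R package or of scikit-learn: the function `partial_dependence_sketch` and the `toy_model` are hypothetical names introduced here. For each grid value, the chosen feature is overwritten in every row and the model's predictions are averaged.

```python
import numpy as np


def partial_dependence_sketch(predict, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v  # overwrite the feature for every row
        values.append(float(np.mean(predict(X_mod))))
    return np.array(values)


def toy_model(A):
    """Stand-in black box (assumption): quadratic in feature 0, linear in feature 1."""
    return A[:, 0] ** 2 + 2.0 * A[:, 1]


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
grid = np.linspace(-1.0, 1.0, 5)

pd0 = partial_dependence_sketch(toy_model, X, 0, grid)  # follows grid**2 up to a constant
pd1 = partial_dependence_sketch(toy_model, X, 1, grid)  # straight line with slope 2
print(pd0, pd1)
```

Plotting `grid` against `pd0` or `pd1` gives the partial dependence curve. Skipping the final average and keeping one curve per row gives the individual conditional expectation plots described in the Goldstein et al. paper listed above.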