Understanding and diagnosing your machine-learning models
Achieving a good prediction is often only half of the job. Questions immediately arise: How can we improve this prediction? What drives it? Can we act on the system based on the predictions? All these questions require understanding how good the model's predictions are, and how the model predicts.
This tutorial assumes basic knowledge of scikit-learn. It will focus on statistics, tests, and interpretation rather than improving the prediction.
- 1. Measuring how well a model predicts
- 1.1. Metrics to judge the success of a model
- 1.2. Cross-validation: some gotchas
- 1.3. Underfit vs overfit: do I need more data, or more complex models?
- 2. Understanding why a classifier predicts
- 2.1. Interpreting linear models
- 2.2. Interpreting random forests
- 2.3. Partial dependence plots
- 2.4. Black-box interpretation of models: LIME
- 3. Appendix: auxiliary figures
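As a taste of the two halves of the tutorial, the sketch below pairs a cross-validated score (section 1) with a first look at linear-model coefficients (section 2.1). The dataset, metric, and model choices here are illustrative assumptions, not taken from the tutorial itself.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# An illustrative dataset and model; any classifier would do here.
X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Section 1: measure how well the model predicts, with cross-validation
# and an explicit choice of metric (here ROC-AUC rather than accuracy).
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"ROC-AUC: {scores.mean():.3f} +/- {scores.std():.3f}")

# Section 2.1: a first look at *why* the model predicts, via the
# coefficients of the fitted linear model (one weight per feature).
model.fit(X, y)
coefs = model.named_steps["logisticregression"].coef_.ravel()
print("number of coefficients:", coefs.size)
```

Note that the coefficients are only interpretable here because the features were standardized first; the tutorial returns to this point when discussing linear models.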