EuroSciPy 2015 | Presentation: How “good” is your model, and how can you make it better?

Wednesday 2 p.m.–3:30 p.m.

How “good” is your model, and how can you make it better?

Chih-Chun Chen, Dimitry Foures, Elena Chatzimichali, Giuseppe Vettigli

Audience level: Intermediate

Description

What distinguishes “true artists” from “one-hit wonders” in machine learning is an understanding of how a model performs with respect to different data. This hands-on tutorial will show you how to use scikit-learn’s model evaluation functions to evaluate different models in terms of accuracy and generalisability, and search for optimal parameter configurations.

Abstract

The objective of this tutorial is to give participants the skills required to validate, evaluate and fine-tune models using scikit-learn’s evaluation metrics and parameter search capabilities. It will cover both the theoretical rationale behind these methods and their implementation in code.

The session will be structured as follows (rough timings in parentheses):
1. Explanation of over-fitting and the bias-variance trade-off, followed by a brief conceptual overview of cross-validation and bootstrapping as validation methods, in particular with respect to bias and variance. Pointers to the corresponding scikit-learn functions will also be given. (20 minutes)
2. Implementation of cross-validation and grid search for parameter tuning, using KNN classification as an illustrative example. Participants will train two KNN classifiers with different numbers of neighbours on preprocessed data (provided). They will then be guided through cross-validation, plotting of results, and grid search to find the best combination(s) of neighbour count and weighting. (30 minutes)
3. Comparison of different classification models using cross-validation. Participants will implement a logistic regression, support vector machine (SVM), Random Forest or neural network model and apply the same cross-validation and grid search method as in the guided KNN example. (50 minutes)
4. Participants will compare their plots and discuss which model they might choose for different objectives, trading off generalisability, accuracy, speed and randomness. (20 minutes)
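As a rough illustration of steps 2 and 3, the workflow might look like the sketch below. It is not the tutorial's own material: the iris data set stands in for the preprocessed data provided in the session, and the neighbour counts, fold count and parameter grid are arbitrary choices made for the example. The imports use `sklearn.model_selection`, where current scikit-learn releases keep these tools; releases from 2015 exposed them under `sklearn.cross_validation` and `sklearn.grid_search`.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): the tutorial supplies its own preprocessed data.
X, y = load_iris(return_X_y=True)

# Step 2a: cross-validate two KNN classifiers that differ only in the
# number of neighbours. cv=5 yields one accuracy score per fold.
cv_scores = {}
for k in (1, 15):
    knn = KNeighborsClassifier(n_neighbors=k)
    cv_scores[k] = cross_val_score(knn, X, y, cv=5)
    print(f"k={k}: mean accuracy {cv_scores[k].mean():.3f} "
          f"(std {cv_scores[k].std():.3f})")

# Step 2b: grid search over neighbour count and weighting scheme.
# Every combination in the grid is scored with 5-fold cross-validation.
param_grid = {
    "n_neighbors": [1, 3, 5, 7, 9, 11, 15],
    "weights": ["uniform", "distance"],
}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)
print("best configuration:", search.best_params_)
print("best mean CV accuracy:", round(search.best_score_, 3))

# Step 3: apply the same cross-validation to a different model family,
# here logistic regression on standardised features.
logreg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
logreg_scores = cross_val_score(logreg, X, y, cv=5)
print(f"logistic regression: mean accuracy {logreg_scores.mean():.3f}")
```

The per-fold scores in `cv_scores` and the `search.cv_results_` dictionary are the natural inputs for the plots mentioned in steps 2 and 4. The spread across folds gives a feel for variance, and the mean gives a feel for accuracy.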