tomMoral/24-sacl-ai-4-sciences

Training material for the Sacl-AI 4 sciences workshop

24-sacl-ai-4-sciences

Syllabus

Session 1 - O. Colliot - Introduction to ML with a focus on validation

Goal: Introduce the basics of ML and describe in detail how to perform validation.

History and terminology
Problem setup for ML basics (Model, loss, learning procedure, features)
Generalization in ML (overfitting, underfitting and model selection)
Validation (performance metrics, validation strategies, statistical analysis)

Session 2 - G. Lemaitre - The scikit-learn API

Goal: Introduce the scikit-learn API, with a focus on practical insights into model validation and selection.

Overview of a simple cross-validation scheme: k-fold
Overview of metrics (Regression, Classification)
Model selection through the SearchCV estimators (e.g. GridSearchCV)
Cross validation in complex settings (stratification, groups, non-iid data)
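
As a rough illustration of the topics listed for this session, the sketch below runs k-fold cross-validation, a small grid search, and a group-aware split with scikit-learn. The synthetic dataset, the choice of `LogisticRegression`, the `C` grid, and the `groups` labels are all assumptions made for this example and are not taken from the workshop notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, GroupKFold, KFold, cross_val_score

# Synthetic data, made up for illustration only.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

model = LogisticRegression(max_iter=1000)

# Simple scheme: 5-fold cross-validation with shuffling.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print("k-fold accuracy per fold:", scores)

# Model selection: grid search over the regularization strength C
# (the grid values are an arbitrary choice for this sketch).
search = GridSearchCV(
    model,
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=cv,
    scoring="accuracy",
)
search.fit(X, y)
print("best C:", search.best_params_["C"])

# Complex setting: samples that share a group (hypothetical labels,
# e.g. one group per subject) never appear in both train and test folds.
groups = [i // 10 for i in range(200)]  # 20 groups of 10 samples
group_scores = cross_val_score(
    model, X, y, cv=GroupKFold(n_splits=5), groups=groups
)
print("group k-fold accuracy per fold:", group_scores)
```

Swapping `KFold` for `StratifiedKFold` or `TimeSeriesSplit` follows the same pattern, since `cross_val_score` and `GridSearchCV` both accept any splitter through the `cv` argument.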

Session 3 - T. Moreau - Learning with non-tabular data

Goal: Introduce the different types of data, with a focus on time series, and the methodologies that apply to each type.

Overview of the different types of data: tabular data, time series, images, graphs, signals.
Overview of the specific problems and jargon with time series and signals.
How to get back to a "classical" ML framework?
Practical illustrations with time series.
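
One common way to get from a time series back to a "classical" tabular setting is to cut the signal into fixed-width windows and summarize each window with a few statistics. The sketch below shows that idea using only the Python standard library. Whether the session uses this particular approach is an assumption, and the helper names `sliding_windows` and `window_features`, the toy signal, and the chosen statistics are hypothetical.

```python
from statistics import mean, pstdev


def sliding_windows(signal, width, step):
    """Cut a 1-D signal into fixed-width windows, moving `step` samples each time."""
    return [
        signal[start:start + width]
        for start in range(0, len(signal) - width + 1, step)
    ]


def window_features(window):
    """Summarize one window as a fixed-length feature vector."""
    return [mean(window), pstdev(window), min(window), max(window)]


# Toy signal, made up for illustration only.
signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

windows = sliding_windows(signal, width=4, step=2)

# Each row of X describes one window, so X has the usual
# (n_samples, n_features) layout expected by tabular ML models.
X = [window_features(w) for w in windows]
print(len(X), "windows with", len(X[0]), "features each")
```

With overlapping windows, neighbouring rows share samples and are not independent, so the validation split should keep them together, for example with a group-aware or time-ordered splitter.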

Session 4 - R. Menegaux - Intro to deep learning

Goal: Describe the main types of deep learning architectures, and apply them to a concrete example from the life sciences.

Introduction: what is deep learning and why is everyone doing it?
Overview of the main types of deep learning architectures: MLPs, convolutional networks, and transformers. When to use one or the other?
Overview of the different training and regularization techniques.
Practical session on a simplified open-research problem.