Training material for the Sacl-AI 4 sciences workshop
Goal: Introduce the basics of ML and describe in detail how to perform validation
- History and terminology
- Problem setup for ML basics (Model, loss, learning procedure, features)
- Generalization in ML (overfitting, underfitting and model selection)
- Validation (performance metrics, validation strategies, statistical analysis)
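As a minimal sketch of the generalization topics above, the following fits polynomials of increasing degree to noisy synthetic data and compares training and validation error. The data-generating function, noise level, split sizes, and degrees are assumptions chosen for illustration, not workshop material.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1D regression problem (assumed for illustration):
# y = sin(2*pi*x) + Gaussian noise
x = rng.uniform(0.0, 1.0, size=60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

# Hold out the last 20 points for validation (x was drawn at random,
# so this is equivalent to a random split)
x_train, y_train = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]


def mse(y_true, y_pred):
    """Mean squared error between two arrays."""
    return float(np.mean((y_true - y_pred) ** 2))


# degree -> (training MSE, validation MSE)
results = {}
for degree in (1, 3, 9):
    coefs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (
        mse(y_train, np.polyval(coefs, x_train)),
        mse(y_val, np.polyval(coefs, x_val)),
    )

for degree, (train_mse, val_mse) in results.items():
    print(f"degree={degree}: train MSE={train_mse:.3f}, validation MSE={val_mse:.3f}")
```

Because the polynomial models are nested, the training error cannot increase with the degree, so it cannot be used to choose between them. The validation error is the quantity to compare when selecting a model.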
Goal: Introduce the scikit-learn API, with a focus on practical insights into model validation and selection.
- Overview of a simple cross-validation scheme: k-fold
- Overview of metrics (Regression, Classification)
- Model selection through the SearchCV estimators (e.g. GridSearchCV, RandomizedSearchCV)
- Cross validation in complex settings (stratification, groups, non-iid data)
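The items above could look roughly like this in scikit-learn. This is a sketch under assumptions: the synthetic dataset, the logistic regression model, and the grid of `C` values are placeholders chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (assumed for illustration)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Scaling lives inside the pipeline so it is re-fit on each training fold,
# which avoids leaking validation-fold statistics into training
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Stratified 5-fold CV keeps class proportions similar across folds
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

# Model selection: search over the regularization strength C
# using the same cross-validation scheme
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(model, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)
print("best parameters:", search.best_params_)
```

For grouped or time-ordered data, the `cv` object would typically be swapped for a splitter such as `GroupKFold` or `TimeSeriesSplit`. Note that the best score found by a search is itself selected on the validation folds, so it tends to be optimistic as an estimate of performance on new data.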
Goal: Introduce the different types of data, with a focus on time series, and the methodologies to apply to each type.
- Overview of the different types of data: tabular data, time series, images, graphs, signals.
- Overview of the specific problems and jargon with time series and signals.
- How to get back to a "classical" ML framework?
- Practical illustrations with time series.
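One common way to get back to a "classical" ML framework is to turn a series into a table of lagged values. The sketch below assumes a toy series and a hypothetical helper, `make_lagged_dataset`, introduced here only for illustration.

```python
import numpy as np


def make_lagged_dataset(series, n_lags):
    """Turn a 1D series into (X, y): each row of X holds the n_lags
    values that immediately precede the corresponding target in y."""
    series = np.asarray(series, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(series, n_lags + 1)
    return windows[:, :-1], windows[:, -1]


# Toy series (assumed for illustration)
series = np.arange(10.0)
X, y = make_lagged_dataset(series, n_lags=3)

# Chronological split: train on the past, validate on the future.
# Shuffling here would let the model see values from the future.
n_train = int(0.7 * len(X))
X_train, y_train = X[:n_train], y[:n_train]
X_val, y_val = X[n_train:], y[n_train:]

print(X[0], y[0])
print(len(X_train), "training rows,", len(X_val), "validation rows")
```

Here the first row pairs the values 0, 1, 2 with the target 3. Consecutive rows overlap, so the samples are not independent, which is why the split is chronological rather than random.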
Goal: Describe the main types of deep learning architectures, and apply them to a concrete example from life sciences.
- Introduction: what is deep learning and why is everyone doing it?
- Overview of the main types of deep learning architectures: MLPs, convolutional networks, and transformers. When to use one or the other?
- Overview of the different training and regularization techniques.
- Practical session on a simplified open-research problem.