sktime-tutorial-ODSC-Europe-2023

Primary language: Jupyter Notebook. License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause).

Welcome to the sktime tutorial at ODSC Europe 2023

This tutorial is about sktime - a unified framework for machine learning with time series. sktime contains algorithms and tools for building, applying, and evaluating modular pipelines and composites for a variety of time series learning tasks, including forecasting, classification, and regression.

sktime is easily extensible by anyone, and interoperable with the pydata/numfocus stack.

This is an introductory sktime half-day tutorial with:

  • a general introduction to sktime
  • forecasting with sktime - uni/multivariate, hierarchical/global
  • feature extraction, transformation pipelines, parameter tuning, autoML
  • engineering topics: interfaces, estimator and dependency management, writing sktime compatible 3rd party estimators
  • deploying sktime in production using mlflow with the mlflavors plugin


🚀 How to get started

In the tutorial, we will move through notebooks section by section.

You have several options for running the tutorial notebooks:

  • Run the notebooks in the cloud on Binder - for this you don't have to install anything!
  • Run the notebooks on your machine. Clone this repository, get conda, install the required packages (sktime, seaborn, jupyter) in an environment, and open the notebooks with that environment. For detailed instructions, see below. For troubleshooting, see sktime's more detailed installation instructions.
  • Alternatively, use python venv, and/or an editable install of this repo as a package. Instructions below.

Please let us know on the sktime discord if you have any issues during the conference, or join to ask for help anytime.

💡 Description

This tutorial presents sktime - a unified framework for machine learning with time series. sktime covers multiple time series learning problems, including time series transformation, classification and forecasting, among others. sktime allows you to easily apply an algorithm for one task to solve another (e.g. a scikit-learn regressor to solve a forecasting problem). In the tutorial, you will learn how to identify these problems, what their key differences are, and how they are related.

sktime provides various time series algorithms and modular composition tools for pipelining, ensembling and tuning. sktime also provides API-compatible interfaces to many popular libraries, such as statsmodels, prophet, statsforecast, tslearn, tsfresh, etc., which can be readily combined using sktime composition patterns.

In this tutorial, you will learn how to use, combine, tune and evaluate different algorithms on real-world data sets. The tutorial proceeds step by step through Jupyter notebooks.

sktime is not just a package, but also an active community which aims to be welcoming to new joiners. We invite anyone to get involved as a developer, user, supporter (or any combination of these).

🎥 Other Tutorials:

👋 How to contribute

If you're interested in contributing to sktime, you can find out more about how to get involved here.

Any contributions are welcome, not just code!

Installation instructions for local use

To run the notebooks locally, you will need:

  • a local repository clone
  • a python environment with required packages installed

Cloning the repository

To clone the repository locally:

git clone https://github.com/sktime/sktime-tutorial-ODSC-Europe-2023.git

Using conda env

  1. Create a python virtual environment: conda create -y -n odsc_sktime python=3.9
  2. Install required packages: conda install -y -n odsc_sktime -c conda-forge pip sktime seaborn jupyter pmdarima statsmodels
  3. Activate your environment: conda activate odsc_sktime
  4. If using jupyter: make the environment available in jupyter: python -m ipykernel install --user --name=odsc_sktime

Using python venv

  1. Create a python virtual environment: python -m venv odsc_sktime
  2. Activate your environment: source odsc_sktime/bin/activate (on Windows: odsc_sktime\Scripts\activate)
  3. Install the requirements: pip install sktime seaborn jupyter pmdarima statsmodels
  4. If using jupyter: make the environment available in jupyter: python -m ipykernel install --user --name=odsc_sktime
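After either installation route, a quick way to check that the environment is complete is to query the installed package versions from Python. The helper below is a hypothetical convenience script, not part of this repository; it uses only the standard library, and the package list is copied from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the install commands above.
REQUIRED = ["sktime", "seaborn", "jupyter", "pmdarima", "statsmodels"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package so missing ones are easy to spot.
for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```

Run it with the interpreter of the activated environment (or in a notebook using the odsc_sktime kernel); any package reported as missing can be installed with the commands above.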