ts-eval

Time Series analysis and evaluation tools

A set of tools to make time series analysis easier.

🧩 Current features

  • N-step ahead time series evaluation – using a Jupyter widget.
  • Friedman / Nemenyi rank test (post hoc) – to see which model performs statistically better.
  • Relative metrics – rMSE, rMAE + Forecasted Value analogues.
  • Prediction interval metrics – MIS, rMIS, FVrMIS.
  • Fixed Fourier series generation – fixed in time according to the pandas index.
  • Naive/seasonal models for baseline predictions (with prediction intervals).
  • Statsmodels n-step evaluation – helper functions to evaluate n-step ahead forecasts using Statsmodels models or naive/seasonal naive models.
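As a rough illustration of what the relative metrics compute, here is a minimal rMAE sketch: the MAE of the evaluated model divided by the MAE of a reference forecast (e.g. seasonal naive), following the usual relative-error definition. The function name and signature are hypothetical, not the library's actual API:

```python
import numpy as np

def rmae(target, pred, reference):
    # Relative MAE: MAE of the evaluated model divided by the MAE of a
    # reference forecast (e.g. seasonal naive). Values below 1 mean the
    # model beats the reference.
    target, pred, reference = map(np.asarray, (target, pred, reference))
    return np.abs(target - pred).mean() / np.abs(target - reference).mean()

target = np.array([10.0, 12.0, 11.0, 13.0])
model = np.array([10.5, 11.5, 11.2, 12.8])
snaive = np.array([9.0, 10.0, 12.0, 11.0])

print(rmae(target, model, snaive))  # below 1.0: the model beats the reference
```

A ratio below 1 reads directly as "better than the baseline", which makes scores comparable across series with different scales.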

๐Ÿ‘ฉ๐Ÿพโ€๐ŸŽจ Widget Preview

In:

(
    TSMetrics(target, sm_seas, default)
    .use_reference(snaive)
    .for_horizons(0, 1, 5, 23)
    .for_time_slices(time_slices.all, time_slices.weekend)
    .with_description()
    .with_prediction_rankings(mtx.FVrMSE, mtx.FVrMIS)
    .with_predictions_plot()
    .show()
)

Out: Demo Screenshot
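The FVrMIS ranking used above builds on the mean interval score (MIS). As a sketch of the underlying metric, here is the standard MIS definition (Gneiting & Raftery, 2007): average interval width plus a 2/alpha penalty for observations falling outside the interval. The helper name is hypothetical, not necessarily the library's signature:

```python
import numpy as np

def mis(target, lower, upper, alpha=0.05):
    # Mean Interval Score for a (1 - alpha) prediction interval:
    # average interval width, plus a 2/alpha penalty proportional to the
    # distance by which an observation falls outside the interval.
    y, lo, hi = map(np.asarray, (target, lower, upper))
    width = hi - lo
    below = (2.0 / alpha) * (lo - y) * (y < lo)
    above = (2.0 / alpha) * (y - hi) * (y > hi)
    return (width + below + above).mean()

y = np.array([10.0, 12.0, 15.0])     # the third point falls above its interval
lower = np.array([9.0, 11.0, 11.0])
upper = np.array([11.0, 13.0, 14.0])
print(mis(y, lower, upper))  # (2 + 2 + 3 + 40) / 3
```

Because the penalty scales with 2/alpha, narrow intervals only pay off if they actually cover the observations, which is what makes MIS a useful single-number summary of interval quality.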

๐Ÿ‘ฉ๐Ÿพโ€๐Ÿš’ Demo

For a more elaborate example, please check out the Demo Notebook.

Alternatively, try the interactive Binder demo.
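The "fixed Fourier series" feature listed above can be illustrated with plain pandas/numpy: the phase of each term is derived from absolute time, so a given timestamp always maps to the same feature values no matter where a series starts. This is a hypothetical helper for illustration, not the library's actual API:

```python
import numpy as np
import pandas as pd

def fourier_features(index, period_hours=24.0, order=2):
    # Phase is computed from the absolute timestamp (hours since the epoch),
    # so identical timestamps yield identical features across datasets.
    hours = index.astype("int64") / 3.6e12  # nanoseconds -> hours
    cols = {}
    for k in range(1, order + 1):
        arg = 2.0 * np.pi * k * hours / period_hours
        cols[f"sin_{k}"] = np.sin(arg)
        cols[f"cos_{k}"] = np.cos(arg)
    return pd.DataFrame(cols, index=index)

idx = pd.date_range("2020-01-01", periods=48, freq="h")
feats = fourier_features(idx)  # 48 rows x 4 columns (sin/cos, orders 1 and 2)
```

Anchoring the phase to the epoch rather than to row position means features stay consistent between training and evaluation windows that start at different dates.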

๐Ÿคฆ๐Ÿพโ€ Motivation

While working on a long-term time series analysis project, I needed to summarize and store the performance metrics of different models and compare them. As it's daunting to do this across dozens of notebooks, I hacked together some code that does what I want in a few lines.

๐Ÿ‘ฉ๐Ÿพโ€๐Ÿš€ Installation

  pip install ts-eval

📋 Release Planning

  • Release 0.3
    • restructure optional dependency groups from the [tests_and_bla_bla] style to [tests,bla]
    • links to papers – AvgRelMAE (Davydenko and Fildes, 2013); link to the Nemenyi paper / implementations
    • make graphs with PIs narrower at steps 0, 1, … since too much space is left (with an option to turn this off)
    • better API for the end user – minimize interaction with xarray
    • pep517 build / wheels / better setup.py as per Hynek
    • travis: add 3.8 as the default Python when it's available
    • docs: supported metrics & API options
    • maybe use an API like Summary in the statsmodels MLEModel class; it has extend methods and warn/info messages
    • pretty legends for plots, like here: https://studywolf.wordpress.com/2017/11/21/matplotlib-legends-for-mean-and-confidence-interval-plots/
    • look for TODOs
    • changeable colors
    • option to turn off the colored display
    • a nicer API for the raw metrics container
    • codacy badge
    • boxplots to compare models (as an alternative)
    • violin plots to compare predictions – areas can be colored, with different metrics on the left and right (like relative...)
  • Release 0.4
    • holiday/Fourier features model
    • slim down the viz module so it carries less of the important logic
    • a GIF visualizing the project
    • check shapes of input arrays (target vs. preds); currently no error is raised on mismatch
    • baseline prediction using the target dataset (without explicit calculation, but losing some time points)

💡 Ideas

  • components
    • Graph: Visualize outliers from confidence interval
    • Multi-comparison component: scikit_posthocs lib or homecooked?
    • inspect true confidence interval coverage via sampling (was done in postings around bayesian dropout sampling)
    • xarrays: check whether compared datasets are actually equal (offsets by dates, shapes, maybe even hashing)
    • bin together step performance, like steps 0-1, 2-5, 6-12, 13-24
    • highlight regions using a mask (holidays, etc.)
    • option to inspect points interactively using a widget (plotly)?
    • diagnostics: bias to over- or underestimate points
    • animated graphs for change in seasonality
  • features
    • an example notebook for Fourier features?
    • tests for the Fourier module
    • nint generation
  • utils:
    • model adaptor (generic, for different models) that generates a 3D prediction dataset; for statsmodels, using dynamic forecasting or the Kalman filter
    • feature importance calculator, but only if I can manipulate input features
    • feature selection using PACF / prewhitening?
  • project
    • sMAPE & MASE can be added to the Jupyter evaluation tables
    • ? residual stats: since I have residuals => Ljung-Box, heteroscedasticity test, Jarque-Bera – like in statsmodels results; but these stats were probably already inspected by the user... and at which step should they be computed then?


🤹🏼‍♂️ Development

Recommended development workflow:

pipenv install -e .[dev]
pipenv shell

The library doesn't use Flit or Poetry; the suggested workflow is based on Pipenv (as per pypa/pipenv#1911). Pipfile* files are ignored via .gitignore.