/fracdiff

Compute fractional differentiation (fracdiff) super-fast. Fracdiff processes time-series to be stationary while preserving its memory. Inspired by "Advances in Financial Machine Learning" by M. Prado.

Primary LanguagePythonBSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

Fracdiff: Super-fast Fractional Differentiation

python versions version CI codecov dl Code style: black

Fracdiff performs fractional differentiation of time-series, a la "Advances in Financial Machine Learning" by M. Prado. Fractional differentiation processes time-series to a stationary one while preserving memory in the original time-series. Fracdiff features super-fast computation and scikit-learn compatible API.

spx

What is fractional differentiation?

See M. L. Prado, "Advances in Financial Machine Learning".

Installation

$ pip install fracdiff

Features

Functionalities

  • fdiff: A function which extends numpy.diff to fractional orders.
  • Fracdiff: Perform fracdiff of a set of time-series. Compatible with scikit-learn API.
  • FracdiffStat: Automatically fracdiff which makes a set of time-series stationary while preserving their maximum memory. Compatible with scikit-learn API.

Speed

Fracdiff is blazingly fast.

The following graphs show that Fracdiff computes fractional differentiation much faster than the "official" implementation.

It is especially noteworthy that execution time does not increase significantly as the number of time-steps (n_samples) increases, thanks to NumPy engine.

time

The following tables of execution times (in unit of ms) show that Fracdiff can be ~10000 times faster than the "official" implementation.

n_samples fracdiff official
100 0.164 +- 0.088 20.725 +- 5.323
1000 0.162 +- 0.055 108.049 +- 3.612
10000 0.268 +- 0.085 1074.970 +- 89.375
100000 0.779 +- 0.241 12418.813 +- 453.551
n_features fracdiff official
1 0.162 +- 0.055 108.049 +- 3.612
10 0.229 +- 0.036 1395.027 +- 172.948
100 1.745 +- 0.196 13820.834 +- 821.352
1000 19.438 +- 2.093 154178.162 +- 11141.337

(Run on Macbook Air 2018, 1.6 GHz Dual-Core Intel Core i5, 16 GB 2133 MHz LPDDR3)

How to use Open In Colab

Fractional differentiation

A function fdiff calculates fractional differentiation. This is an extension of numpy.diff to a fractional order.

import numpy as np
from fracdiff import fdiff

a = np.array([1, 2, 4, 7, 0])
fdiff(a, n=0.5)
# array([ 1.       ,  1.5      ,  2.875    ,  4.6875   , -4.1640625])
np.array_equal(fdiff(a, n=1), np.diff(a, n=1))
# True

a = np.array([[1, 3, 6, 10], [0, 5, 6, 8]])
fdiff(a, n=0.5, axis=0)
# array([[ 1. ,  3. ,  6. , 10. ],
#        [-0.5,  3.5,  3. ,  3. ]])
fdiff(a, n=0.5, axis=-1)
# array([[1.    , 2.5   , 4.375 , 6.5625],
#        [0.    , 5.    , 3.5   , 4.375 ]])

Preprocessing by fractional differentiation

A transformer class Fracdiff performs fractional differentiation by its method transform.

from fracdiff import Fracdiff

X = ...  # 2d time-series with shape (n_samples, n_features)

f = Fracdiff(0.5)
X = f.fit_transform(X)

For example, 0.5th differentiation of S&P 500 historical price looks like this:

spx

Fracdiff is compatible with scikit-learn API. One can imcorporate it into a pipeline.

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

X, y = ...  # Dataset

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('fracdiff', Fracdiff(0.5)),
    ('regressor', LinearRegression()),
])
pipeline.fit(X, y)

Fractional differentiation while preserving memory

A transformer class FracdiffStat finds the minumum order of fractional differentiation that makes time-series stationary. Differentiated time-series with this order is obtained by subsequently applying transform method. This series is interpreted as a stationary time-series keeping the maximum memory of the original time-series.

from fracdiff import FracdiffStat

X = ...  # 2d time-series with shape (n_samples, n_features)

f = FracdiffStat()
X = f.fit_transform(X)
f.d_
# array([0.71875 , 0.609375, 0.515625])

The result for Nikkei 225 index historical price looks like this:

nky

Other examples are provided here.

Example solutions of exercises in Section 5 of "Advances in Financial Machine Learning" are provided here.

References