/cpdetect

Python package for multiple change-point detection.

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

Change-point detection with cpdetect

PyPI version GitHub Release Date Downloads

cpdetect is a python package designed for change point-detection using statistical methods. This is a first version and considers only one change-point model which is normal mean model. This assumes normally distributed time series values and changes only in the mean (mean shift). The package offers three detection methods which are Binary Segmentation (BS), Backward Detection (BWD) and Screening and Ranking algorithm (SaRa).

Install

To install the package use

pip install cpdetect

Example usage

Let's import some useful libraries first.

import scipy.stats as sp
import numpy as np
from matplotlib import pyplot as plt

Now we can create an example time series with three change-points.

# mean shifts between 0 and 3, sigma = 2
Y1 = sp.norm.rvs(0, 2, 200)
Y2 = sp.norm.rvs(3, 2, 200)
Y3 = sp.norm.rvs(0, 2, 200)
Y4 = sp.norm.rvs(3, 2, 200)
Y = np.hstack((Y1, Y2, Y3, Y4))
plt.plot(Y)

To find the change-points location we can use BinSeg which contains binary segmentation implementation.

from cpdetect import BinSeg

bs = BinSeg()                 # creating object

bs.fit(Y, stat='Z', sigma=2)  # fitting to data

plt.plot(bs.stat_values)      # statistic plot

bs.predict(0.01)              # change-point detection

If we don't know what the standard deviation (sigma) is, we can use T statistic.

bs.fit(Y, stat='T')        # fitting to data

plt.plot(bs.stat_values)   # statistic plot

bs.predict(0.01)           # change-point detection

If we don't know the distribution of time series values, we can't use normal mean models. Then we can use bootstrap which finds the statistic distribution by itself.

bs.predict(0.01, bootstrap_samples=1000)

Libraries used

  • numpy
  • pandas
  • scipy