Language: Python. License: MIT.

DoPe

Douglas-Peucker line simplification (data reduction).

Reduces the number of points in a two-dimensional dataset, while preserving its most striking features.

The resulting dataset is a subset of the original dataset.
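To illustrate the idea, here is a minimal sketch of a recursive Douglas-Peucker pass in plain Python. It is not the implementation used by this package: the function names `perpendicular_distance` and `simplify_recursive` are hypothetical, and the tolerance is applied to the raw coordinates without any normalization.

```python
from math import hypot


def perpendicular_distance(point, start, end):
    """Distance from `point` to the straight line through `start` and `end`."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = hypot(dx, dy)
    if length == 0:
        # Degenerate chord: fall back to the distance to the single point.
        return hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def simplify_recursive(points, tolerance):
    """Return a subset of `points` (hypothetical helper, not the package API)."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord between the end points.
    distances = [
        perpendicular_distance(p, points[0], points[-1]) for p in points[1:-1]
    ]
    max_distance = max(distances)
    index = distances.index(max_distance) + 1
    if max_distance <= tolerance:
        # Every interior point is close enough to the chord: drop them all.
        return [points[0], points[-1]]
    # Otherwise keep the farthest point and simplify both halves.
    left = simplify_recursive(points[: index + 1], tolerance)
    right = simplify_recursive(points[index:], tolerance)
    # The split point ends `left` and starts `right`, so drop one copy.
    return left[:-1] + right
```

The end points are always kept, and every returned point comes from the input, which is why the result is a subset of the original dataset.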

Line simplification is typically used for geographical data, e.g. when zooming a digital map (see Django's GEOSGeometry.simplify(), which is based on GEOS). However, this type of algorithm can also be applied to general data reduction problems, as an alternative or addition to conventional filtering or subsampling. Some examples:

  • creating miniature data plots
  • pre-processing time-series data for feature detection (e.g. peak detection)

Installation

Normal installation:

pip install dopelines

With plot support (adds matplotlib):

pip install "dopelines[plot]"

With development tools:

pip install "dopelines[dev]"

Note: The PyPI project is called dopelines instead of dope, because PyPI would not let us create a project named dope, even though the name appears to be available.

Example

from dope import DoPeR

data_original = [
    [0, 0], [1, -1], [2, 2], [3, 0], [4, 0], [5, -1], [6, 1], [7, 0]
]

dp = DoPeR(data=data_original)

# use tolerance threshold (i.e. max. error w.r.t. normalized data)
data_simplified_eps = dp.simplify(tolerance=0.2)

# compare original data and simplified data in a plot
dp.plot()

# or use maximum recursion depth
data_simplified_depth = dp.simplify(max_depth=2)

Figure: Example line simplification plot.

Also see examples in tests.
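The example describes the tolerance as a maximum error with respect to normalized data. How the package normalizes is not stated here. One plausible reading, shown below purely as an assumption, is min-max scaling of each axis to the range [0, 1], so that the tolerance becomes a dimensionless fraction of the data range. The helper name `normalize` is hypothetical.

```python
def normalize(points):
    """Min-max scale each axis to [0, 1] (assumed scheme, not the package's)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_span = min(xs), max(xs) - min(xs)
    y_min, y_span = min(ys), max(ys) - min(ys)
    return [
        [
            # A constant axis has zero span; map it to 0.0 to avoid dividing by zero.
            (x - x_min) / x_span if x_span else 0.0,
            (y - y_min) / y_span if y_span else 0.0,
        ]
        for x, y in points
    ]
```

Under this assumption, a tolerance of 0.2 would mean that dropped points lie within 0.2 units of the simplified line in the scaled coordinate system, regardless of the original units of either axis.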

Limitations

Currently we only offer a recursive (depth-first) implementation, which is intuitive but may not be the most efficient solution. An iterative (breadth-first) implementation is in the works.
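For comparison, a breadth-first, non-recursive variant could look roughly like the sketch below. It is not the planned implementation: the names `perpendicular_distance` and `simplify_iterative` are hypothetical, and the tolerance is applied to raw coordinates. Replacing recursion with an explicit queue avoids Python's recursion limit on long inputs.

```python
from collections import deque
from math import hypot


def perpendicular_distance(point, start, end):
    """Distance from `point` to the straight line through `start` and `end`."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = hypot(dx, dy)
    if length == 0:
        # Degenerate chord: fall back to the distance to the single point.
        return hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def simplify_iterative(points, tolerance):
    """Breadth-first Douglas-Peucker sketch (hypothetical, not the package API)."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True
    # Each queue entry is an index range (first, last) still to be examined.
    queue = deque([(0, len(points) - 1)])
    while queue:
        first, last = queue.popleft()  # FIFO order gives breadth-first traversal
        max_distance, index = 0.0, None
        for i in range(first + 1, last):
            d = perpendicular_distance(points[i], points[first], points[last])
            if d > max_distance:
                max_distance, index = d, i
        if index is not None and max_distance > tolerance:
            # Keep the farthest point and examine both sub-ranges later.
            keep[index] = True
            queue.append((first, index))
            queue.append((index, last))
    return [p for p, k in zip(points, keep) if k]
```

Working on index ranges instead of list slices also avoids copying the data at every split.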

References:

Douglas DH, Peucker TK. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica: The International Journal for Geographic Information and Geovisualization. 1973;10(2):112-122.