/simplification

Fast LineString simplification using RDP or Visvalingam-Whyatt and a Rust binary

Primary LanguagePythonMIT LicenseMIT

Build Status Build status Coverage Status

Simplification

Simplify a LineString using the Ramer–Douglas–Peucker or Visvalingam-Whyatt algorithms

Line

Installation

pip install simplification
Please use a recent (>= 8.1.2) version of pip.

Supported Python Versions

  • Python 2.7
  • Python 3.6
  • Python 3.7

Supported Platforms

  • Linux (manylinux1-compatible)
  • macOS
  • Windows 32-bit / 64-bit

Usage

import numpy as np
from simplification.cutil import simplify_coords, simplify_coords_vw, simplify_coords_vwp

# Using Ramer–Douglas–Peucker
coords = [
    [0.0, 0.0],
    [5.0, 4.0],
    [11.0, 5.5],
    [17.3, 3.2],
    [27.8, 0.1]
]

# For RDP, Try an epsilon of 1.0 to start with. Other sensible values include 0.01, 0.001
simplified = simplify_coords(coords, 1.0)

# simplified is [[0.0, 0.0], [5.0, 4.0], [11.0, 5.5], [27.8, 0.1]]

# Using Visvalingam-Whyatt
# You can also pass numpy arrays, in which case you'll get numpy arrays back
coords_vw = np.array([
    [5.0, 2.0],
    [3.0, 8.0],
    [6.0, 20.0],
    [7.0, 25.0],
    [10.0, 10.0]
])
simplified_vw = simplify_coords_vw(coords, 30.0)

# simplified_vw is [[5.0, 2.0], [7.0, 25.0], [10.0, 10.0]]

Passing empty and/or 1-element lists will return them unaltered.

But I need to ensure that the resulting geometries are valid

You can use the topology-preserving variant of VW for this: simplify_coords_vwp. It's slower, but has a far greater likelihood of producing a valid geometry.

But I Want to Simplify Polylines

No problem; Decode them to LineStrings first.

# pip install pypolyline before you do this
from pypolyline.cutil import decode_polyline
# an iterable of Google-encoded Polylines, so precision is 5. For OSRM &c., it's 6
decoded = (decode_polyline(line, 5) for line in polylines)
simplified = [simplify_coords(line, 1.0) for line in decoded]

How it Works

FFI and a Rust binary

Is It Fast

I should think so.

What does that mean

Using numpy arrays for input and output, the library can be reasonably expected to process around 2500 1000-point LineStrings per second on a Core i7 or equivalent, for a 98%+ reduction in size.
A larger LineString, containing 200k+ points can be reduced to around 3k points (98.5%+) in around 50ms using RDP.

This is based on a test harness available here.

Disclaimer

All benchmarks are subjective, and pathological input will greatly increase processing time. Error-checking is non-existent at this point.

License

MIT