cleder/pygeoif

Simplify line

cleder opened this issue · 3 comments

via ChatGPT:
The Ramer-Douglas-Peucker algorithm is a line simplification algorithm used to reduce the number of vertices in a curve while preserving its shape as much as possible. The algorithm works by recursively dividing the curve into smaller line segments and approximating each segment with a line. It then calculates the perpendicular distance between each vertex and the approximating line, and if the distance is greater than a user-defined tolerance, the vertex is considered significant and retained. Otherwise, it is discarded. The algorithm continues to recursively simplify the curve until all vertices are either retained or discarded. The resulting simplified curve has fewer vertices and is less complex, which can be useful for visualization, analysis, and storage of geographic data.

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker algorithm to simplify a line.

    Args:
        points (list): A list of (x, y) tuples representing a line.
        epsilon (float): Maximum distance between a point and the simplified line.

    Returns:
        A simplified list of (x, y) tuples.
    """
    if len(points) <= 2:
        return points

    # Find the point with the maximum distance from the line
    dmax = 0
    index = 0
    for i in range(1, len(points)-1):
        d = point_to_line_distance(points[i], points[0], points[-1])
        if d > dmax:
            index = i
            dmax = d

    # If the maximum distance is greater than epsilon, recursively simplify the line
    if dmax > epsilon:
        # Simplify the line before the index
        simplified1 = rdp(points[:index+1], epsilon)
        # Simplify the line after the index
        simplified2 = rdp(points[index:], epsilon)
        # Combine the simplified lines
        simplified = simplified1[:-1] + simplified2
    else:
        simplified = [points[0], points[-1]]

    return simplified

def point_to_line_distance(point, start, end):
    """Calculate the distance between a point and a line.

    Args:
        point (tuple): A (x, y) tuple representing the point.
        start (tuple): A (x, y) tuple representing the start of the line.
        end (tuple): A (x, y) tuple representing the end of the line.

    Returns:
        The distance between the point and the line.
    """
    x, y = point
    x1, y1 = start
    x2, y2 = end
    numerator = abs((y2-y1)*x - (x2-x1)*y + x2*y1 - y2*x1)
    denominator = ((y2-y1)**2 + (x2-x1)**2)**0.5
    return numerator / denominator

To use this function, pass a list of (x, y) tuples representing the line and the maximum distance epsilon. For example:

>>> points = [(0, 0), (1, 1), (2, 2), (3, 1), (4, 0)]
>>> simplified = rdp(points, 0.5)
>>> print(simplified)
[(0, 0), (2, 2), (4, 0)]

via ChatGPT:
The Visvalingam-Whyatt algorithm is a line simplification algorithm that aims to reduce the number of vertices in a polyline or polygon while preserving its shape and visual appearance.

The algorithm works by assigning a "importance" value to each vertex based on the area of the triangle formed by that vertex and its two adjacent vertices. Vertices with the smallest importance values are then iteratively removed from the polyline or polygon until a target number of vertices is reached.

The Visvalingam-Whyatt algorithm is often used in cartography and geographic information systems (GIS) to simplify complex geographical features such as coastlines or road networks while still retaining their essential shape and structure.

def simplify_vw(points, threshold):
    # Compute the effective area (triangle formed by the adjacent points)
    areas = []
    for i in range(1, len(points) - 1):
        a, b, c = points[i-1:i+2]
        areas.append(abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])))

    # Compute the simplification priority (smallest area)
    priorities = dict(zip(range(1, len(points) - 1), areas))
    sorted_priorities = sorted(priorities.items(), key=lambda x: x[1])

    # Simplify the line
    while sorted_priorities:
        index, priority = sorted_priorities.pop()

        if priority > threshold:
            break

        del points[index]

        if index > 1:
            areas[index-2] = abs((points[index-2][0] - points[index-1][0]) * (points[index][1] - points[index-2][1]) - (points[index][0] - points[index-2][0]) * (points[index-1][1] - points[index-2][1]))
            priorities[index-1] = areas[index-2]
            sorted_priorities = sorted(priorities.items(), key=lambda x: x[1])

    return points

This function takes two arguments: points, which is a list of (x, y) tuples representing the points in the line, and threshold, which is the simplification threshold. The function returns a simplified list of points.

The function first computes the effective area of each triangle formed by three adjacent points in the line. It then assigns a simplification priority to each point, based on the area of the triangle it belongs to. The points with the smallest priorities are simplified first.

The function then iterates through the priorities, starting with the point with the smallest priority. If the priority is greater than the threshold, the function stops simplifying the line. Otherwise, it removes the point with the current index from the line. It then updates the effective area and simplification priority of the two adjacent points, and re-sorts the priorities. The function repeats this process until it reaches a point with priority greater than the threshold, or until it has simplified all the points in the line.