tilezen/mapbox-vector-tile

Polygon validity and performance

zerebubuth opened this issue · 5 comments

Making sure that polygons are valid (which we're not even doing 100% at, see #95 & #87) takes up, on average, about 60% of the time it takes to format an MVT tile. Formatting MVT tiles takes up around 21% of total runtime, and formatting tiles in general takes around 50% of all runtime.

We know that polygon validity is hard, but we need it for V2 #42. We also know that performance could be better #72. Currently, we recurse trying to make the polygon valid #76, which can account for a lot of time for polygons which flip-flop between being valid with fractional coordinates and invalid when we round them to integers.

There are other bits of software we could use to do the polygon validity checking, e.g: prepair or CGAL, or we could do a better job of ensuring polygon validity while rebuilding (while avoiding known issues). A neat method that someone else invented, but I've been unable to find a source for, would be to render the polygon at the integer resolution we want for MVT, then run marching squares to recover a polygon.

@zerebubuth I am well versed in this problem and more then willing to help guide you in any way possible here. Even if you need to have a phone call or something definitely willing to help out. It is in the best interest of all the encoders if we are attempting to solve this together!

Great, thanks!

What do you do to make valid polygons? Perhaps there's a small bit of software we can extract and reuse?

@zerebubuth we developed https://github.com/mapbox/wagyu specifically to address this problem. Its not a small bit of software. Making OGC valid polygons can be really tough in some crazy situations, almost any algorithm isn't perfect and I promise you will take a massive lift to develop on your own. Doing this in pure python would likely be very slow as well. I have considered for a long time what we could possibly do to have these common problems all solved in one compiled language and distributed in different languages. It seems like the best solution to me. Additionally, this puts more eyes on the code so we could do better to solve problems together.

Thanks, we'll definitely take a look at wrapping that and seeing if we can re-use it.

The README describes Wagyu as a fork / evolution of Clipper, which we already use. Seems like the topology correction step is the extra bit we're missing, since Clipper doesn't handle that for us. We have something which tries to do that, but sometimes fails.

in my issues #103 to solve like this

for example:

a polygon in geojson >>

{
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [
            [
                [0, 0],
                [10, 0],
                [0, 10],
                [0, 0]
            ],
            [
                [15, 0],
                [15, 15],
                [30, 0],
                [15, 0]
            ]
        ]
    }
}

you must make it to MultiPolygon like this

{
    "type": "Feature",
    "geometry": {
        "type": "MultiPolygon",
        "coordinates": [
            [[
                [0, 0],
                [10, 0],
                [0, 10],
                [0, 0]
            ]],
            [[
                [15, 0],
                [15, 15],
                [30, 0],
                [15, 0]
            ]]
        ]
    }
}