mapbox/vtzero

Geometry handling: strict vs. non-strict mode

joto opened this issue · 6 comments

joto commented

Currently the geometry handling has a run-time setting strict that enables strict mode in which the library does some more tests on the geometry it got from the vector tile.

Originally I thought the user code would enable this for version 2 tiles and disable this for version 1 tiles. But I am not sure this makes sense. If the code using these geometries can handle them in non-strict mode anyway, why the extra checks? And if it can't, we'll always need those checks. Basically what those checks give you is additional postconditions on the geometry and it depends on the users code whether it can handle those postconditions, this is not something that usually depends on run-time information.

Do we actally need this? Does this have to be a run-time setting? Maybe this is something that should be a template parameter.

@flippmoke

I think we can drop strict mode currently. We almost always will want to do the best we can with a given dataset to make it return a result in the decoder. In this sense the decoder should always be very flexible if possible, the encoder should be strict if possible to prevent the creation of bad data. So I think we do not need strict in the decoder.

Thanks for this ticket @joto - really helpful to have you lay out the problem.

The value in my mind of a strict mode is to ensure vtzero can be adopted by the most types of users and help, therefore, increase adoption of this code.

So, my mind goes to the variety of usecases.

One extreme usecase where I'd want the least amount of strictness checks is when using the vtzero decoder in an application that needs to handle, without introducing breakages, existing and old v1 tilesets. The worst case scenario might be where these tilesets have layers where 100% of the polygon geometries don't adhere to the (new in v2) rule like no consecutive equal points so if vtzero threw on this condition the downstream user would need a try/catch around vtzero and the behavior on upgrade to vtzero would result in maps being blank where data was previously able to be rendered with the very non-strict https://github.com/mapbox/vector-tile/blob/e6150ec7b8b6b2752af99aa545a108847415555c/include/mapbox/vector_tile.hpp#L214-L301

The usecase where I'd want the most amount of strictness checks is when writing a vector tile validator whose job is to reject tiles that don't have geometries adhering to the v2 spec.

Do we actually need this?

Per my reflecting above, yes I think we need this to support v1 tiles.

Does this have to be a run-time setting? Maybe this is something that should be a template parameter.

Per my thinking-out-loud about uscases above, I'm not seeing this needing to be a runtime argument.

joto commented

To summarize: We have two different use cases:

  1. Normal reading of tiles should be forgiving
  2. Validating of tiles should be strict

One and the same program will probably not do both, so differentiating those use-cases can be compile-time.

Looking at the checks we are doing in strict mode here is what we got:

  1. Check that there are no two consecutive equal points.
  2. Check that the last point in a ring is not the same as the first (because ClosePath already covers that). This is basically the same as 1., just a special case for rings.
  3. LineTo command with count of < 2 for rings.
  4. Check that the area of the ring is not 0.
  • Points 1, 2, and 3 can easily be checked outside vtzero if needed.
  • For point 4, the area can be re-calculated, which would allow this check. We can also think about the vtzero interface here. Currently we calculate the area, but only return a boolean (area > 0) to indicate whether this is an inner or outer ring. We could change this to a) either return the area or b) return an enum { inner, outer, zero }; or so.

So I think it looks like we can completely remove the strict mode, just need do document that vtzero is not checking this and users need to do it themselves if they want to.

So I think it looks like we can completely remove the strict mode, just need do document that vtzero is not checking this and users need to do it themselves if they want to.

👍 Similar to the solution at #24 (comment), what does everyone think about having vtzero-show gain a --strict mode that does these checks on top of the core decoder?

joto commented

We should probably think about some general tool that does this validation, but can also do things like pack vector tiles into mbtiles files or get them out, etc. Something similar to what osmium-tool is to libosmium. We can base this in vtzero-show but I would like to keep the example programs simple to allow them to be, well, examples for vtzero use. (vtzero-show is getting too big already.)

joto commented

I have removed the strict parameter now from the geometry decoding functions (API change!) and removed those internal checks. I have added an example program vtzero-check that does all those checks and some more. This can be used as basic for more extensive checks in the future.