Polygonization algorithm?
lqh-0514 opened this issue · 5 comments
I would like to know if you are going to release the Polygonization algorithm? Really looking forward that.
Or could you give some brief description on how it is implemented?
same here. do anyone have any idea ?
I'm also interested. Perhaps at this stage the answer is no?
@TimuSumisu It would be fair to reply (at least).
It's proprietary so they cannot share.. sorry to bring the bad news.
Before I dive in here, just want to be clear that I haven't worked on this project, and the following is my assumption of what the "polygonization" step consists of when these building footprints are created.
The Microsoft AI for Earth team published a blog post around the time this building footprint data set was released that outlines the process of generating building footprints using AI: https://azure.microsoft.com/en-us/blog/how-to-extract-building-footprints-from-satellite-images-using-deep-learning/
In here it basically uses AI object detection to classify the pixels in the image as "building pixels" and "background". I would assume the polygonization step that is mentioned in this repo is the process of going from these pixels to spatial polygon geometries. I suspect the following is done to achieve this:
- The pixel data is scanned until a building pixel is found. A magic wand / flood fill algorithm is used to find all connected pixels that form that building polygon. For example: https://github.com/Tamersoul/magic-wand-js This would create a rough polygon.
- Simplification can be done on the polygon to reduce noise and reduce the number of points in the polygon. https://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93Peucker_algorithm
- By this point the polygon will be smooth, however, since we generally expect most buildings to have square corners. So, an orthogonalization process is likely done similar what is available in some of the OSM editors: https://josm.openstreetmap.de/wiki/Help/Action/OrthogonalizeShape
- I would expect that steps 2 and 3 would be done on pixel data using simple 2D math. The final step would be to take the georeferenced image information and convert the pixel data into geospatial coordinates. If the source imagery is map tiles from one of the Microsoft map platforms (Bing Maps/Azure Maps), then the conversion is likely a Mercator projection.
Doing these steps individually will work fine, but I suspect the Microsoft maps team took combined these algorithms to optimize the process and reduce the number of times they had to loop over each array of points/pixels. Given the scale they would be running this on, every little optimization could have a huge impact, as such, I suspect the algorithm they are using for this polygonization step is very deeply embedded within their code and likely difficult to extract as a standalone algorithm for general polygonization purposes.