simonpoole/mapsplit

Enable updating


The current updating mechanism doesn't really work. It should be replaced by at least one of the following:

  • update via conventional OSC diffs. This requires adding an index that maps each element to the tiles containing it.
  • update via augmented diffs
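
To make the first option more concrete, here is a minimal sketch of such an index and of how an OSC file could be turned into a list of tiles to regenerate. It is not mapsplit code: `ElementTileIndex`, `changed_elements`, and the `(zoom, x, y)` tile tuples are hypothetical names chosen for illustration. The only thing taken as given is the osmChange layout (`<create>`/`<modify>`/`<delete>` blocks containing `<node>`/`<way>`/`<relation>` elements with an `id` attribute).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class ElementTileIndex:
    """Hypothetical in-memory index: (element type, id) -> set of tiles."""

    def __init__(self):
        self._tiles = defaultdict(set)

    def add(self, elem_type, elem_id, tile):
        # tile is an opaque hashable key, here assumed to be (zoom, x, y)
        self._tiles[(elem_type, elem_id)].add(tile)

    def dirty_tiles(self, changed):
        """Return every tile that contains at least one changed element."""
        dirty = set()
        for key in changed:
            # .get() avoids creating empty entries for unknown elements
            dirty |= self._tiles.get(key, set())
        return dirty


def changed_elements(osc_xml):
    """Collect (type, id) for all elements touched by an osmChange document."""
    root = ET.fromstring(osc_xml)
    changed = set()
    for action in root:        # <create>, <modify>, <delete>
        for elem in action:    # <node>, <way>, <relation>
            if elem.tag in ("node", "way", "relation"):
                changed.add((elem.tag, int(elem.get("id"))))
    return changed
```

Note that this only finds tiles that already contain an element. Newly created elements, and nodes that moved into a different tile, would still have to be located from their new coordinates, so the index is necessary but not sufficient.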

Augmented diffs (as described at https://wiki.openstreetmap.org/wiki/Overpass_API/Augmented_Diffs ) are somewhat computationally expensive, at least when using an official Overpass release, in particular because they're only cached on the server for a few hours. After that, they have to be recalculated every time someone tries to update their local dataset, which doesn't seem feasible given limited server capacity.

Maybe it would be good to list a few assumptions, like how large the area to be updated would be, what the maximum timeframe for updates is, and also expected number of users, etc.

Given the target audience, updates from "OSM" would typically not be needed more often than once per day, and even then they would not have to be consumed directly.

But as I mentioned in the GSOC discussion, Overpass augmented diffs are really massive overkill (and provide guarantees that are not needed). I've suggested in the past that we should be producing a simpler version of them directly from the OSM DB, which would remove most of the current issues.

Given that augmented_diffs just boil down to a simple QL query covering a one-minute interval (e.g. https://overpass-api.de/api/augmented_diff?id=4983215&debug=true ), using them for once-per-day updates would indeed be massive overkill.

A quick test showed that it would take more than 2.5 hours to download a day's worth of data, with more than 5 GB of uncompressed data to be processed. I'd try switching to larger time intervals by using a custom QL query, thereby cutting down the number of queries and maybe also the data volume. Still, it's barely a good fit for this use case.
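
A rough sketch of what "larger time intervals" could look like: split the day into windows and build one `adiff` query per window. The `[adiff:"…","…"]` setting and the `changed:` filter are documented Overpass QL, but the exact query body here is an assumption modeled on what the `debug=true` output appears to contain, and it has not been run against a live server. `adiff_windows` and `adiff_query` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

TIME_FMT = "%Y-%m-%dT%H:%M:%SZ"


def adiff_windows(start, end, step):
    """Yield (t0, t1) pairs that cover [start, end) in chunks of `step`."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    t0 = start
    while t0 < end:
        t1 = min(t0 + step, end)  # last window may be shorter
        yield t0, t1
        t0 = t1


def adiff_query(t0, t1):
    """Build an Overpass QL augmented-diff query for one window (assumed form)."""
    a, b = t0.strftime(TIME_FMT), t1.strftime(TIME_FMT)
    return (
        f'[adiff:"{a}","{b}"];'
        f'(node(changed:"{a}","{b}");'
        f'way(changed:"{a}","{b}");'
        f'rel(changed:"{a}","{b}"););'
        "out meta geom;"
    )


day = datetime(2021, 1, 1, tzinfo=timezone.utc)  # arbitrary example date
hourly = list(adiff_windows(day, day + timedelta(days=1), timedelta(hours=1)))
queries = [adiff_query(t0, t1) for t0, t1 in hourly]
```

With one-hour windows this gives 24 queries per day instead of 1440 one-minute ones. Whether the total data volume actually drops depends on how many elements change more than once within a window.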

On my Overpass fork, you can generate 24 hourly augmented diffs in about 22 minutes; the upstream/official release would take about 1.5 hours. The data volume is about 8.3 GB uncompressed.
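
For comparison, the figures quoted in this thread can be broken down per diff. This is back-of-the-envelope arithmetic only: it assumes the 2.5 hour / 5 GB test fetched 1440 one-minute diffs (the thread does not say so explicitly), treats "more than" figures as lower bounds, and uses decimal units (1 GB = 1000 MB).

```python
def seconds_per_diff(total_minutes, n_diffs):
    """Average wall-clock seconds spent per diff."""
    return total_minutes * 60 / n_diffs


def mb_per_diff(total_gb, n_diffs):
    """Average uncompressed size per diff in MB (decimal units)."""
    return total_gb * 1000 / n_diffs


# One-minute diffs from the official server (assumed: 1440 diffs per day)
minutely_s = seconds_per_diff(150, 1440)
minutely_mb = mb_per_diff(5.0, 1440)

# Hourly diffs: fork vs. upstream/official, 24 diffs per day
fork_hourly_s = seconds_per_diff(22, 24)
upstream_hourly_s = seconds_per_diff(90, 24)
hourly_mb = mb_per_diff(8.3, 24)

# How much faster the fork is for the same 24 hourly diffs
fork_speedup = 90 / 22

print(f"minutely: >= {minutely_s:.2f} s and >= {minutely_mb:.1f} MB per diff")
print(f"hourly (fork): {fork_hourly_s:.0f} s, about {hourly_mb:.0f} MB per diff")
print(f"hourly (upstream): {upstream_hourly_s:.0f} s per diff")
print(f"fork speedup: about {fork_speedup:.1f}x")
```

Either way, a full day adds up to several gigabytes of uncompressed XML, which is the part that makes this a poor fit for once-per-day updates.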

osmdbt's create-diff would be a good playground to evaluate alternative options: https://github.com/openstreetmap/osmdbt/blob/master/src/osmdbt-create-diff.cpp