felt/tippecanoe

Low CPU usage for Tippecanoe on EC2

I'm trying to run Tippecanoe on EC2 with a very large GeoJSON file (~110 GB). The job takes far too long: it has run for more than 48 hours without finishing. After a lot of searching for root causes, I ran 'top' and noticed Tippecanoe was using only 1 of the 16 available cores. When I ran 'top' locally on macOS, Tippecanoe was using 7 cores, which would explain why it was so much faster locally.
After experimenting with Tippecanoe and reading your docs, I found the TIPPECANOE_MAX_THREADS environment variable and set it to 16, one thread per core. This briefly raised CPU usage to all 16 cores, but once reading reaches 99.9%, usage drops back to a single core, so the job takes days to complete.
Do you have any recommendations or help that you could provide in debugging this issue?

Try converting the GeoJSON to FlatGeobuf using ogr2ogr and then running Tippecanoe on the result.

FlatGeobuf (FGB) files are smaller, stream faster, and are processed in parallel by default.
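As a rough sketch of that two-step workflow, the Python snippet below only builds and prints the two command lines; it does not run them. The file names are hypothetical placeholders, and the Tippecanoe call uses default options, so you would add your own zoom and layer settings.

```python
import shlex


def build_commands(geojson_path, fgb_path, mbtiles_path):
    """Return the conversion and tiling commands as argument lists.

    Nothing is executed here; the lists can be passed to subprocess.run().
    """
    # Step 1: convert GeoJSON to FlatGeobuf with GDAL's ogr2ogr
    # (destination comes before source on the ogr2ogr command line).
    convert = ["ogr2ogr", "-f", "FlatGeobuf", fgb_path, geojson_path]
    # Step 2: tile the FlatGeobuf file with Tippecanoe, default options.
    tile = ["tippecanoe", "-o", mbtiles_path, fgb_path]
    return convert, tile


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    convert, tile = build_commands("input.geojson", "input.fgb", "output.mbtiles")
    for cmd in (convert, tile):
        print(shlex.join(cmd))
    # To actually run a step: subprocess.run(cmd, check=True)
```

Keeping the commands as lists avoids shell quoting problems if the real paths contain spaces.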

Hope that helps

Matt

No problem. tile-join only works on mbtiles files, so I don't think you'd see any improvement there.

@mtravis Thanks for the suggestion. My concern is that, at the end of the day, I'd still have to run Tippecanoe on the EC2 instance with a single core. I have done a good amount of testing on the instance, and this seems to be related to Tippecanoe's implementation. I'm currently looking through the source code for a possible bug.

Wondering if @e-n-f has any insights on this? It seems that setting TIPPECANOE_MAX_THREADS simply isn't forcing more CPU usage. I can confirm the Docker container has 16 cores available.

Tippecanoe will generally use as many CPUs as are available, even if TIPPECANOE_MAX_THREADS is not set, but there are a few parts of tippecanoe that are inherently single-threaded: feature reordering after ingestion and before tiling is limited by I/O speed, and most of processing the z0 tile is a single thread since there is only one tile in the zoom level.
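To rule out the container itself, it can help to check how many CPUs a process inside it is actually allowed to use. The following is a minimal Python sketch, assuming a Linux host with cgroup v2 (the `/sys/fs/cgroup/cpu.max` path is that assumption; the function names are hypothetical). It says nothing about how tippecanoe itself counts CPUs; it only reports what the environment exposes.

```python
import os


def visible_cpus():
    """Report logical CPUs and the CPUs this process may be scheduled on."""
    logical = os.cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):
        # Linux only: honours taskset / cpuset restrictions.
        usable = len(os.sched_getaffinity(0))
    else:
        usable = logical
    return {"logical": logical, "usable": usable}


def cgroup_cpu_limit(path="/sys/fs/cgroup/cpu.max"):
    """Return the cgroup v2 CPU quota in cores, or None if unlimited/unknown."""
    try:
        with open(path) as f:
            quota, period = f.read().split()
        if quota == "max":
            return None  # no quota configured
        return int(quota) / int(period)
    except (OSError, ValueError):
        return None  # file missing (e.g. cgroup v1 or macOS) or unparsable


if __name__ == "__main__":
    print(visible_cpus())
    print("cgroup CPU limit:", cgroup_cpu_limit())
```

If `usable` or the cgroup limit is well below 16, the bottleneck is the container configuration; if both report 16, the single-threaded stages described above are the more likely explanation.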

Are there any log messages visible at the point where it is stuck? "Reordering geometry?" "Merging vertices?"

Can you share a copy of the GeoJSON file so I can try to reproduce the problem?