felt/tippecanoe

Feasibility of adding geoparquet support

mtravis opened this issue · 4 comments

Is there any possibility that GeoParquet support will be added at some point?

We're currently taking the Overture data and converting it to FlatGeobuf for use with tippecanoe, like so:

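for file in $filelist
do
    # Fetch one Overture Parquet file without signing the request
    aws s3 cp --no-sign-request "$location$file" .
    # Convert the WKB geometry column to FlatGeobuf via DuckDB's GDAL writer
    time duckdb -c "install spatial; load spatial; COPY (SELECT id, st_geomfromwkb(geometry) AS geom FROM read_parquet('$file')) TO '$file.fgb' WITH (FORMAT GDAL, DRIVER 'flatgeobuf');"
    # Remove the downloaded Parquet file once it has been converted
    rm "$file"
done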

Being able to drop this step would be great. Happy to look at ways of funding the development of this feature if it is possible.

bdon commented

What about streaming GeoJSONSeq out of DuckDB and into tippecanoe?

GeoParquet support would entail adding Arrow and Parquet as dependencies, which would complicate the build significantly.
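For illustration, the DuckDB streaming idea might look roughly like the sketch below. It is not taken from the thread: the function name `parquet_to_geojsonseq` is hypothetical, and it assumes the `geometry` column holds WKB (as in the loop above) and that `ST_AsGeoJSON` and `json_object` are available in the installed DuckDB and spatial extension versions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print one GeoJSON Feature per line for one Parquet file.
# -noheader and -list make the DuckDB CLI emit only the raw JSON text.
parquet_to_geojsonseq() {
    local file="$1"
    duckdb -noheader -list -c "
        INSTALL spatial; LOAD spatial;
        SELECT json_object(
            'type', 'Feature',
            'properties', json_object('id', id),
            'geometry', ST_AsGeoJSON(ST_GeomFromWKB(geometry))::JSON
        )
        FROM read_parquet('${file}');"
}

# Example (needs duckdb and tippecanoe on PATH; file names are placeholders):
#   parquet_to_geojsonseq buildings.parquet | tippecanoe -o buildings.pmtiles -l buildings
```

Nothing is written to disk between DuckDB and tippecanoe here, which is the point of the suggestion. Whether it beats the FlatGeobuf route on a global dataset would need measuring.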

mtravis commented

@bdon we're taking the global coverage and converting it to fgb. Streaming via GeoJSONSeq would take even longer, wouldn't it?

bdon commented

I am suggesting using gpq so that the GeoJSON never touches disk, like this:

gpq convert Cairo_Governorate.parquet --to=geojson | tippecanoe -o Cairo.pmtiles

It might be faster overall than the intermediate step of writing fgb out to disk and then reading it back, unless you are already streaming fgb.
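Extending this to many files could be sketched as follows. The helper name `stream_geojson` and the output file name are hypothetical. The sketch assumes gpq writes to stdout when no output path is given, and it relies on tippecanoe accepting several GeoJSON objects concatenated on standard input.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: convert each GeoParquet file given as an argument
# and write the resulting GeoJSON to stdout, one file after another.
stream_geojson() {
    local f
    for f in "$@"; do
        gpq convert "$f" --to=geojson
    done
}

# Example (needs gpq and tippecanoe on PATH; file names are placeholders):
#   stream_geojson ./*.parquet | tippecanoe -o buildings.pmtiles -l buildings
```

A single tippecanoe run then sees all features in one layer, much like passing several .fgb files on the command line, but without the intermediate files.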

mtravis commented

@bdon thanks for the suggestion. Streaming to tippecanoe via gpq is not something I had thought about.

The code snippet I provided probably paints a false picture of what we are actually doing. Currently I'm downloading the entire global dataset of Overture buildings and converting it to FlatGeobuf files. I then run something like this:

tippecanoe -overture-buildings -l buildings *.fgb

That's perfect for what we need. Removing the middle step would be good, but it doesn't take an overly long time. If adding GeoParquet support would overcomplicate the build, then I'll close this one for now.