sewenew/redis-protobuf

training project for me: add geobuf input/output for GeoJSON

Opened this issue · 3 comments

@sewenew let me run an idea by you:

geobuf is a common wire format for GeoJSON - isomorphic but smaller than the naive serialisation with geojson.proto I used

geobuf is very effective in terms of serialized size but unwieldy to process in its protobuf form, hence codecs exist for GeoJSON <-> geobuf conversions in several languages

the idea would be to add a pb.get key --format geobuf option which would encode a geojson.proto object into geobuf, possibly resurrecting some code from here

this would be conditional on the key's format being a legit GeoJSON toplevel object

it could also work vice versa: importing a geobuf and decoding it into GeoJSON

what do you think? whacky? wrong place to do this? (if so: where would be a better place?)

the idea would be to add a pb.get key --format geobuf option which would encode a geojson.proto object into geobuf,

Sorry, but I didn't get the idea quite clearly.

It seems that the Geobuf format is, in fact, a plain protobuf file, and redis-protobuf already supports converting between JSON and protobuf messages. If you have a GeoJSON-encoded string, you can set it as a Geobuf. Vice versa, you can output a Geobuf object as a JSON string which conforms to the GeoJSON format.

wrong place to do this? (if so: where would be a better place?)

In fact, I was planning to create a Redis module which can index geo objects, i.e. redis-geometry. This module was planned to use GeoJSON as its output format. I think I can also take Geobuf as a format.

Regards

yes, geobuf is a protobuf wire format to transport GeoJSON. The point with geobuf is: a) transport any GeoJSON object b) compress repeated and unnecessary stuff away

at the application level you use GeoJSON, and encode/decode it with the geobuf codec - nobody uses geobuf in its JSON representation.

the encode/decode is not just a straightforward protobuf encode. There are at least 2 transforms:

  • coordinates scaled up to int and precision-limited
  • property names are collected into a per-message dictionary and replaced by indices into that array
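
To make the two transforms above concrete, here is a minimal Python sketch. It is not the geobuf codec and does not produce geobuf wire format; the function names, the default precision of 6 and the (index, value) pair layout are assumptions made up for illustration only.

```python
def scale_coords(coords, precision=6):
    """Scale float coordinates to integers, keeping `precision` decimal places.

    `precision` is a hypothetical parameter for this sketch.
    """
    factor = 10 ** precision
    return [[round(v * factor) for v in point] for point in coords]


def unscale_coords(int_coords, precision=6):
    """Inverse of scale_coords (lossy beyond `precision` decimal places)."""
    factor = 10 ** precision
    return [[v / factor for v in point] for point in int_coords]


def index_properties(features):
    """Collect property names into one shared list and replace them by indices.

    `features` is a list of property dicts, one per feature. Returns the
    shared key list and, per feature, a list of (key_index, value) pairs.
    """
    keys = []        # per-message dictionary of property names
    positions = {}   # property name -> index into `keys`
    packed = []
    for props in features:
        pairs = []
        for name, value in props.items():
            if name not in positions:
                positions[name] = len(keys)
                keys.append(name)
            pairs.append((positions[name], value))
        packed.append(pairs)
    return keys, packed


def restore_properties(keys, packed):
    """Inverse of index_properties: rebuild the per-feature property dicts."""
    return [{keys[i]: value for i, value in pairs} for pairs in packed]
```

Each property name is stored once per message instead of once per feature, and coordinates become integers, which protobuf can encode compactly as varints.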

you could deal with this at the redis-protobuf command level plus some scripting to do the compression, but it is unwieldy

result is: a geobuf message can be substantially smaller than the corresponding GeoJSON object serialized with geojson.proto

happy to add a py-geobuf example to clarify the point

Sorry for the late reply.

It looks good to me to make redis-protobuf support the Geobuf format. It would be better if Geobuf had a detailed specification (or did I miss it?).

However, I'm currently too busy to develop this feature. If you want to contribute, you're always welcome :)

Regards