felt/tippecanoe

How much can I gain by shortening attribute names?

Closed this issue · 2 comments

Hello!

I have a dataset of around five million points and I'd like to keep the tiles as small as possible. Each point has around six attributes with readable, descriptive names like:

  • download_speed_max
  • download_speed_min

My question is: how much could I stand to gain in terms of keeping the tiles small if I renamed the attributes in the source GeoJSON to something much shorter, like:

  • download_speed_max => dmax
  • download_speed_min => dmin

Are there already optimizations for attribute names when the tiles get encoded? I don't know protobufs well, but I'm wondering if there's already some sort of normalization that happens (e.g. download_speed_max gets encoded to a1 internally; my imagination is running wild here 😆)

Thank you in advance for any tips on this!

bdon commented

Likely very little. The MVT design stores each attribute name only once per layer in each tile, and features reference it via an integer index: https://github.com/mapbox/vector-tile-spec/blob/master/2.1/README.md#41-layers

Each feature in a layer (see below) may have one or more key-value pairs as its metadata. The keys and values are indices into two lists, keys and values, that are shared across the layer's features.

Each element in the keys field of the layer is a string. The keys include all the keys of features used in the layer, and each key may be referenced by its positional index in this set of keys, with the first key having an index of 0. The set of keys SHOULD NOT contain two or more values which are byte-for-byte identical.
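
To put a rough number on that, here is a minimal Python sketch of the shared key/value tables the spec describes. It is a simplified model, not tippecanoe's actual encoder: the helper names (build_layer, key_table_bytes, varint_size) and the sample speeds are made up for illustration, and it only counts the uncompressed bytes of the keys list, assuming each key is a protobuf string field with a one-byte tag.

```python
def varint_size(n: int) -> int:
    """Bytes protobuf needs to encode a non-negative integer as a varint."""
    size = 1
    while n >= 0x80:
        n >>= 7
        size += 1
    return size


def build_layer(features):
    """Deduplicate keys and values into shared lists, as an MVT layer does.

    Returns (keys, values, tags_per_feature), where each feature's tags are
    alternating key-index / value-index integers.
    """
    keys, values = [], []
    key_index, value_index = {}, {}
    tags_per_feature = []
    for props in features:
        tags = []
        for name, value in props.items():
            if name not in key_index:
                key_index[name] = len(keys)
                keys.append(name)
            vkey = (type(value), value)  # keep 1 and True distinct
            if vkey not in value_index:
                value_index[vkey] = len(values)
                values.append(value)
            tags += [key_index[name], value_index[vkey]]
        tags_per_feature.append(tags)
    return keys, values, tags_per_feature


def key_table_bytes(keys):
    """Uncompressed size of the keys list: tag byte + length prefix + UTF-8 bytes."""
    total = 0
    for key in keys:
        payload = len(key.encode("utf-8"))
        total += 1 + varint_size(payload) + payload
    return total


# Hypothetical sample features; the values are invented for the example.
points = [
    {"download_speed_max": 100, "download_speed_min": 20},
    {"download_speed_max": 250, "download_speed_min": 20},
    {"download_speed_max": 100, "download_speed_min": 5},
]
rename = {"download_speed_max": "dmax", "download_speed_min": "dmin"}
renamed_points = [{rename[k]: v for k, v in p.items()} for p in points]

long_keys, _, long_tags = build_layer(points)
short_keys, _, short_tags = build_layer(renamed_points)

saved = key_table_bytes(long_keys) - key_table_bytes(short_keys)
print(long_tags == short_tags)  # True: per-feature data does not depend on name length
print(saved)                    # 28 bytes saved per layer per tile, before compression
```

In this model the per-feature tags are identical for long and short names, so the saving is a fixed 28 bytes per layer per tile no matter how many points the tile holds. Tiles are often gzip-compressed as well, which would likely shrink the difference further.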

Just the info I was looking for, thank you @bdon!