opengeospatial/geopackage

Zipped GeoPackage files and media type

Opened this issue · 11 comments

Is it common to zip GeoPackage files? If yes, should a media type, application/geopackage+sqlite3+zip, be registered for that at IANA?

In that way, an API conforming to OGC API - Features and the (draft) INSPIRE good practice building on OGC API - Features could link to such a zipped GeoPackage file using that media type.

"links": [
  { ... },
  { "href": "https://download.my-org.eu/buildings.zip",
    "rel": "enclosure",
    "type": "application/geopackage+sqlite3+zip",
    "title": "Download the dataset as a GeoPackage (CRS: EPSG:25832)",
    "length": 472546 },
  { ... }
  ],

See also

@heidivanparys Since HTTP already supports Content-Encoding as a mechanism to exchange the data compressed, would there still be enough value in a dedicated application/geopackage+sqlite3+zip media type?
The same +zip question kind of applies to all formats that an OGC API might deliver.

Maybe there is still value for very large files, so that they can be saved directly with the .zip extension, and so that the server does not have to compress them on the fly in some cases?

@heidivanparys any thoughts on @jerstlouis's comment?

I have mixed thoughts on this. It makes sense because lots of +zip media types are registered at IANA but, at the same time, application/vnd.sqlite3+zip is not registered. Is it common to distribute sqlite3 files compressed?

Since HTTP already supports Content-Encoding as a mechanism to exchange the data compressed, would there still be enough value in a dedicated application/geopackage+sqlite3+zip media type?
The same +zip question kind of applies to all formats that an OGC API might deliver.

Maybe there is still value for very large files, so that they can be saved directly with the .zip extension, and so that the server does not have to compress them on the fly in some cases?

Is it common to distribute sqlite3 files compressed?

@jerstlouis @fjlopez I don't know what is common practice, but I can describe the practice at the agency where I work. One of our distribution channels is the Danish Map Supply. One of the ways you can get data from the Danish Map Supply is by downloading a dataset or a predefined subset of a dataset from the Map Supply's FTP server.

A host of predefined sections of data sets are readily available for download. These are both sections of historical data sets and sections from updated data sets that are updated regularly to reflect the newest available data. E.g. the matricular maps are updated every two months.

The FTP server stores (subsets of) datasets in different formats. I had a look again, and almost all files are zipped. So the shapefiles, GML files, MapInfo files, etc. are compressed and then put on the FTP server, from where users can retrieve those zip files.

Links to those zip files, along with information about their media types, are also present in our Atom feeds, see e.g. https://download.kortforsyningen.dk/sites/default/files/feeds/NamedPlace.xml:

<entry xml:lang="da">
    <title>DK INSPIRE NamedPlace</title>
    <!-- ... -->
    <link
        rel="alternate"
        href="ftp://ftp.kortforsyningen.dk/atomfeeds/INSPIRE/GML/EPSG_3044/DK_NamedPlace.gml.gz"
        type="application/x-gmz"
        length="109479325"
        title="DK INSPIRE NamedPlace"
        hreflang="da"/>
    <!-- ... -->
    <id>ftp://ftp.kortforsyningen.dk/atomfeeds/INSPIRE/GML/EPSG_3044/DK_NamedPlace.gml.gz</id>
    <!-- ... -->
  </entry>

(Media type application/x-gmz is described on https://inspire.ec.europa.eu/media-types/application/x-gmz).
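A client reading such a feed could pick download links by their declared media type. Here is a minimal sketch using only the Python standard library. It assumes the feed uses the standard Atom namespace (`http://www.w3.org/2005/Atom`), which the excerpt above does not show, and the helper name `links_by_type` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the feed declares the standard Atom namespace.
ATOM = "{http://www.w3.org/2005/Atom}"


def links_by_type(feed_xml, media_type):
    """Return (href, length) pairs for all Atom links with the given type."""
    root = ET.fromstring(feed_xml)
    found = []
    for link in root.iter(ATOM + "link"):
        if link.get("type") == media_type:
            length = link.get("length")
            found.append((link.get("href"), int(length) if length else None))
    return found


# Trimmed-down version of the entry quoted above, wrapped in a <feed> element.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry xml:lang="da">
    <title>DK INSPIRE NamedPlace</title>
    <link rel="alternate"
          href="ftp://ftp.kortforsyningen.dk/atomfeeds/INSPIRE/GML/EPSG_3044/DK_NamedPlace.gml.gz"
          type="application/x-gmz"
          length="109479325"/>
  </entry>
</feed>"""

print(links_by_type(FEED, "application/x-gmz"))
```

The same lookup would work for a zipped GeoPackage link, provided the feed states a media type that the client can recognise.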

I have mixed thoughts on this. It makes sense because lots of +zip media types are registered at IANA but, at the same time, application/vnd.sqlite3+zip is not registered. Is it common to distribute sqlite3 files compressed?

I am not convinced that we can conclude that it is not common to distribute sqlite3 files compressed just because application/vnd.sqlite3+zip is not registered. Another explanation could be that nobody cared to register application/vnd.sqlite3+zip because there is no need to comply with a certain specification or best practice.

IMHO, two questions should not be mixed: whether GeoPackage files are distributed compressed, and whether an IANA media type needs to be registered for that case:

  • I agree that GeoPackage files can be distributed zipped with a proper name (e.g. name.gpkg.zip).
  • I think that there is no need to register a specific media type, because RFC 6839, section 3.6 ("The +zip Structured Syntax Suffix"), defines when and how to use +zip. Hence application/geopackage+sqlite3+zip is OK, as application/geopackage+sqlite3 is already registered.

RFC 6839 may explain why application/vnd.sqlite3+zip has not been registered.

I think that there is no need to register a specific media type, because RFC 6839, section 3.6 ("The +zip Structured Syntax Suffix"), defines when and how to use +zip. Hence application/geopackage+sqlite3+zip is OK, as application/geopackage+sqlite3 is already registered.

Earlier, I made the same assumption. However, in another, similar discussion on zipped GeoJSON files (the relevant part starting here), @cportele wrote the following in this comment:

[...] In my understanding 6839 states rules for media types with a suffix like "+zip". It does not say a suffix "+zip" may be added to any existing media type. Something like application/geo+json+zip would not be a valid media type. It would still need to be registered with IANA. [...]

I agree with you, my assumption was wrong. See this excerpt from RFC 6838, Media Type Specifications and Registration Procedures:

Media types that make use of a named structured syntax SHOULD use the
appropriate registered "+suffix" for that structured syntax when they
are registered.

The IANA registry of structured syntax suffixes shows that +gzip is also registered. But it makes sense to register only application/geopackage+sqlite3+zip, given the popularity and availability of the ZIP format.

So the consensus is to ask IANA to register application/geopackage+sqlite3+zip? I just want to be sure before I move forward.

Also, please keep in mind that even if the media type is application/geopackage+sqlite3, the data can still be compressed in transit through Accept-Encoding negotiation. Unless the visualization client directly supports zipped GeoPackage, this avoids the extra step and the duplication of data that extracting an archive would require.
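To make the distinction concrete: with HTTP content coding, the media type stays application/geopackage+sqlite3 and only the transfer is compressed. Below is a minimal sketch of what a client does with such a response, simulated without any network access. The header values and the `decode_body` helper are illustrative assumptions. gzip is used because it is a common HTTP content coding, whereas a ZIP archive is a container format and not a content coding.

```python
import gzip


def decode_body(headers, body):
    """Undo the HTTP content coding, if any (hypothetical helper)."""
    coding = headers.get("Content-Encoding", "identity").lower()
    if coding == "gzip":
        return gzip.decompress(body)
    if coding == "identity":
        return body
    raise ValueError(f"unsupported content coding: {coding}")


# Simulated response: the payload is a stand-in for a GeoPackage
# (only the 16-byte SQLite header is real, the rest is padding).
original = b"SQLite format 3\x00" + bytes(84)
headers = {
    "Content-Type": "application/geopackage+sqlite3",  # media type is unchanged
    "Content-Encoding": "gzip",                        # compression is per transfer
}
body = gzip.compress(original)

geopackage_bytes = decode_body(headers, body)
print(geopackage_bytes == original)
```

HTTP libraries typically perform this decoding transparently, so the application only ever sees the uncompressed GeoPackage.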

@jerstlouis there are scenarios where having +zip is needed. For example, we can have links to GeoPackages in an Atom file that point to:

  • HTTP servers with content negotiation enabled. The link type can be application/geopackage+sqlite3 and the user agent may negotiate if the server sends the GeoPackage file compressed or not. ✔️
  • FTP servers or HTTP servers without content negotiation enabled. Here we have three cases:
    • If the file served is not compressed, we must use application/geopackage+sqlite3. ✔️
    • If the file served is compressed in ZIP format and the link type is application/geopackage+sqlite3, the user agent may think that the GeoPackage file is broken ❌ or it must have a method to sniff the mime type. 🤞
    • If the file served is compressed in ZIP format and the link type is application/geopackage+sqlite3+zip, the user agent will uncompress it and then use the GeoPackage. ✔️
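From the user agent's side, the cases above could be sketched roughly as follows. This is only an illustration, not something the thread specifies. The function name `extract_geopackage` and the fallback sniffing on magic bytes are assumptions, and application/geopackage+sqlite3+zip is the media type proposed in this thread, not one that is registered at this point in the discussion.

```python
import io
import zipfile

GPKG_TYPE = "application/geopackage+sqlite3"
GPKG_ZIP_TYPE = "application/geopackage+sqlite3+zip"  # proposed in this thread

SQLITE_MAGIC = b"SQLite format 3\x00"  # header string at the start of a SQLite file


def extract_geopackage(link_type, payload):
    """Return raw GeoPackage bytes for a downloaded payload (hypothetical helper)."""
    if link_type == GPKG_ZIP_TYPE:
        # Declared as zipped: unpack the first *.gpkg member of the archive.
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            for name in archive.namelist():
                if name.lower().endswith(".gpkg"):
                    return archive.read(name)
        raise ValueError("ZIP archive contains no .gpkg member")
    if link_type == GPKG_TYPE:
        if payload.startswith(SQLITE_MAGIC):
            return payload
        # Declared as a plain GeoPackage but is not one. Without sniffing,
        # this is the "file looks broken" case; with sniffing we can recover.
        if zipfile.is_zipfile(io.BytesIO(payload)):
            return extract_geopackage(GPKG_ZIP_TYPE, payload)
        raise ValueError("payload is not a SQLite/GeoPackage file")
    raise ValueError(f"unsupported media type: {link_type}")
```

With the +zip type declared, the client takes the first branch directly and never has to guess. The sniffing branch only exists to cope with links whose declared type does not match the served file.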

Assigning to @ogcscotts to contact IANA.