phoboslab/qoi

Consider versioning the header

nigeltao opened this issue · 9 comments

The header is currently 12 bytes:

struct qoi_header_t {
  char magic[4];         // magic bytes "qoif"
  unsigned short width;  // image width in pixels
  unsigned short height; // image height in pixels
  unsigned int size;     // number of data bytes following this header
};

Endianness is already discussed in #10.

If the file format isn't set in stone yet, consider:

  • changing the 4-byte magic so that it's not ASCII text, so that accidental matches are less likely. This is part of why the PNG magic header starts with an 0x89. A suggestion: [0x71 0xf8 0x69 0x66] is invalid UTF-8 but "qøif" in Latin-1.
  • widening the width and height to 24 or 32 bits. 65535-pixel-wide images (the 16-bit maximum) might seem "impractically large" right now, but that's less than an order of magnitude more than what my phone camera can produce, and it wasn't so long ago that 640x480 was considered "high resolution".
  • adding a version number somewhere, to enable future extensions to the format.
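To make the first bullet concrete, here is a minimal sketch of checking the suggested magic and of why the 0xf8 byte keeps the sequence from being valid UTF-8. The names `QOI_MAGIC_SUGGESTED`, `qoi_magic_matches` and `utf8_byte_never_valid` are hypothetical and not part of qoi.h.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Suggested non-ASCII magic from the list above: "qøif" in Latin-1. */
static const unsigned char QOI_MAGIC_SUGGESTED[4] = {0x71, 0xf8, 0x69, 0x66};

/* Hypothetical helper: true if buf starts with the suggested magic. */
static bool qoi_magic_matches(const unsigned char *buf, size_t len) {
    return len >= 4 && memcmp(buf, QOI_MAGIC_SUGGESTED, 4) == 0;
}

/* Bytes 0xC0, 0xC1 and 0xF5..0xFF never occur anywhere in well-formed
   UTF-8, so a file containing one of them cannot be valid UTF-8 text. */
static bool utf8_byte_never_valid(unsigned char b) {
    return b == 0xc0 || b == 0xc1 || b >= 0xf5;
}
```

Because the second magic byte (0xf8) falls in the never-valid range, a plain text file cannot accidentally begin with these four bytes, whereas the ASCII string "qoif" could.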

You might find some inspiration in NIE's 16 byte header:
https://github.com/google/wuffs/blob/main/doc/spec/nie-spec.md

There's also, as mentioned in the Hacker News discussion, the idea of re-using the IFF / RIFF container format.

A version field is a must, especially if you later decide to change something that breaks backwards compatibility.

sbrki commented

It would also be smart to add info about number of channels (#16).
I'd also consider placing a few bytes as reserved for future use.

I don't think reserving bytes makes sense. If you have a version in the header, you don't need extra bytes. What's a scenario where an old version of the library can read an image made with a newer, incompatible version because of the extra bytes?

sbrki commented

I think there should be some sort of reserved bytes. Not much, a byte or two is probably enough.

What if, down the road, it is decided to also support Hilbert curve traversal alongside the current "row by row" traversal? It would not make much sense to create a new parallel version with the only difference being the traversal strategy. That is just one example.

I agree with @oscardssmith that reserving bytes isn't necessary. As long as the version number is early enough in the byte stream, you can just say that e.g. "version 1 header is 12 bytes, version 2 header is 16 bytes". You don't need to pad the version 1 header out to 16 bytes just in case. And if it turns out that version 3 needs more than you've reserved, then you're going to have to do something like that anyway.

One option, if you want a version 1 library to fall back gracefully on a version N file (for N > 1), is for the header to contain its own length (as well as or in place of a version). For example, have two bytes at a fixed offset that hold the header length. A value of 12 means version 1. A value of 16 means version 2. Etc. This is essentially what the BMP file format does, having evolved over decades, backwards and forwards compatibly. BMP also started off conceptually as a very simple image file format.
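A rough sketch of that header-length idea, seen from a version 1 reader. It assumes a hypothetical layout with a 2-byte big-endian length at offset 4, right after the magic, and that later versions only append fields after the first 12 bytes. Neither assumption is part of any QOI spec, and `v1_data_offset` is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

#define HDR_LEN_OFFSET 4   /* assumed fixed offset of the length field */
#define HDR_LEN_V1     12  /* header length a version 1 reader knows */

/* Returns the offset where the data following the header starts, or 0 if
   the buffer does not hold a header this reader can handle. */
static size_t v1_data_offset(const uint8_t *buf, size_t len) {
    if (len < HDR_LEN_OFFSET + 2) {
        return 0; /* too short to even contain the length field */
    }
    size_t hdr_len = ((size_t)buf[HDR_LEN_OFFSET] << 8) | buf[HDR_LEN_OFFSET + 1];
    if (hdr_len < HDR_LEN_V1 || hdr_len > len) {
        return 0; /* shorter than version 1, or truncated input */
    }
    /* Bytes 12..hdr_len-1 belong to a later version: skip, don't interpret. */
    return hdr_len;
}
```

With this layout a version 1 reader given a 16-byte version 2 header would read the first 12 bytes as usual and start decoding at offset 16. Whether the result is meaningful still depends on later versions keeping the data encoding compatible.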

sbrki commented

@nigeltao hmm, that does make sense. Reserved bytes might not be a good idea after all.

What about a field to define the header's size along with one for possible future extensions?

struct qoi_header_t {
  char     magic[4];   // magic bytes "qoif"
  uint32_t width;      // image width in pixels (BE)
  uint32_t height;     // image height in pixels (BE)
  uint8_t  channels;   // must be 3 (RGB) or 4 (RGBA)
  uint8_t  colorspace; // a bitmap 0000rgba where
                       //   - a zero bit indicates sRGBA, 
                       //   - a one bit indicates linear (user interpreted)
                       //   colorspace for each channel
};
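For illustration, a minimal sketch of reading this 14-byte layout byte by byte, so that the result does not depend on host endianness or struct padding (on many ABIs `sizeof` of the struct above would be 16, not 14). The names `qoi_desc_sketch`, `read_be32` and `parse_header_sketch` are hypothetical and not the qoi.h API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t width;
    uint32_t height;
    uint8_t  channels;
    uint8_t  colorspace;
} qoi_desc_sketch;

/* Decode a 32-bit big-endian value from four bytes. */
static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 1 if buf holds a well-formed 14-byte header, 0 otherwise. */
static int parse_header_sketch(const uint8_t *buf, size_t len, qoi_desc_sketch *out) {
    if (len < 14 || memcmp(buf, "qoif", 4) != 0) {
        return 0;
    }
    out->width      = read_be32(buf + 4);
    out->height     = read_be32(buf + 8);
    out->channels   = buf[12];
    out->colorspace = buf[13];
    return out->channels == 3 || out->channels == 4;
}
```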

Looks like we have a version number after all! :-)

It's hidden in the high 5 bits of the channels byte. Since channels must be 3 or 4, those bits are always zero today. Zero means "version 1" (and a 14-byte header). Non-zero means a later version (e.g. header length is then written to the 15th and 16th bytes).
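A small sketch of that reading of the channels byte. The split into a 3-bit channel count and a 5-bit version marker is only the interpretation suggested in this comment, not something the format defines, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Low 3 bits: enough to hold the channel counts 3 and 4. */
static unsigned channels_count(uint8_t channels_byte) {
    return channels_byte & 0x07u;
}

/* High 5 bits: zero for every header written today, so they could act as
   an implicit version marker (assumption from the comment above). */
static unsigned channels_version_bits(uint8_t channels_byte) {
    return (unsigned)channels_byte >> 3;
}

/* Returns 1 if the byte looks like a "version 1" channels field. */
static int channels_is_version1(uint8_t channels_byte) {
    unsigned n = channels_count(channels_byte);
    return channels_version_bits(channels_byte) == 0 && (n == 3 || n == 4);
}
```

A decoder that already rejects any channels value other than 3 or 4 gets this check for free: a file with non-zero high bits fails validation instead of being misread.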

We also have a version number in the qoif header. I'm more in favor of having a few versions with large changes.