Cosmoglobe/Commander

Ideas for lossless Huffman compression of TOD

mreineck opened this issue · 3 comments

It may be possible to perform a lossless Huffman compression of (single precision) TOD, which could save quite significant amounts of space. However, this idea only works if

  • the value range of the data does not include 0
  • the values have a fairly small dynamic range (i.e. maxval/minval is not very far from 1).

Both conditions should be fulfilled for Planck TOD, correct?
Under these conditions, we can determine the smallest possible distance between floating point values at minval (see, e.g., https://en.wikipedia.org/wiki/Unit_in_the_last_place) and call this d. Assuming the values are positive, so that minval is also the value of smallest magnitude, the spacing of every larger value is a power-of-two multiple of d. All numbers in the TOD stream are then, by construction, an exact integer multiple of d away from minval, and the maximum integer is (maxval-minval)/d. Representing the TOD by these integers is a lossless transformation, and the integers themselves can be differenced and Huffman-compressed.
I'd expect pretty significant space savings when using this method.
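
To make the mapping concrete, here is a minimal sketch in Python/NumPy. It is only an illustration under stated assumptions, not Commander code: the function names `encode_tod` and `decode_tod` are hypothetical, the TOD is assumed to be a 1-D array of strictly positive, finite float32 values, and the Huffman stage itself is left out (the returned deltas are what would be fed to it). The arithmetic is done in double precision so that the subtraction and the division by d stay exact.

```python
import numpy as np


def encode_tod(tod):
    """Map positive float32 samples to integer deltas (hypothetical helper)."""
    tod = np.asarray(tod, dtype=np.float32)
    if tod.ndim != 1 or tod.size == 0:
        raise ValueError("expected a non-empty 1-D array")
    if not (np.all(np.isfinite(tod)) and np.all(tod > 0)):
        raise ValueError("this sketch assumes finite, strictly positive values")

    minval = tod.min()
    maxval = tod.max()
    d = np.spacing(minval)  # float32 ULP at minval

    # Every sample is an integer multiple of d; require that the largest
    # such multiple still fits exactly in a float64 mantissa.
    if float(maxval) / float(d) >= 2.0**53:
        raise ValueError("dynamic range too large for this sketch")

    # Exact in float64: both operands are multiples of d, and d is a power of two.
    k = (tod.astype(np.float64) - float(minval)) / float(d)
    ints = k.astype(np.int64)

    # Difference consecutive integers; the first delta is ints[0] itself.
    deltas = np.diff(ints, prepend=0)
    return minval, d, deltas


def decode_tod(minval, d, deltas):
    """Invert encode_tod and return the original float32 samples."""
    ints = np.cumsum(np.asarray(deltas, dtype=np.int64))
    values = float(minval) + ints.astype(np.float64) * float(d)
    return values.astype(np.float32)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a TOD stream with small dynamic range.
    tod = (1.0 + 0.01 * np.cumsum(rng.normal(size=10_000)) / 100.0).astype(np.float32)
    minval, d, deltas = encode_tod(tod)
    restored = decode_tod(minval, d, deltas)
    print("lossless round trip:", np.array_equal(restored, tod))
    print("distinct delta values:", np.unique(deltas).size)
```

The fewer distinct delta values there are, and the more skewed their distribution, the better a Huffman code over them should do; how well this works on real Planck TOD would have to be measured.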

Of course, as Mathew mentioned, it would be much better to use the digitized values coming out of the ADC directly. But I'm not sure how elaborate the process is to get from these values to usable TOD. It might be too expensive to do this on every decompression.

hke commented

OK, good to know that this is already on the table! No need to keep this open, then.