`encode` lossless documentation correction

Question

Closed this issue 9 months ago · 3 comments

According to the encode documentation, these two calls should give the same result:

import numpy as np
from pydicom import dcmread
from openjpeg import encode, decode

dcm = dcmread("chest.dcm") 

with open('test_jpeg_1.j2k', 'wb') as f:
    f.write(encode(dcm.pixel_array))

with open('test_jpeg_2.j2k', 'wb') as f:
    f.write(encode(dcm.pixel_array, compression_ratios=[1]))

with open('test_jpeg_1.j2k', 'rb') as f:
    im_1 = decode(f.read())

with open('test_jpeg_2.j2k', 'rb') as f:
    im_2 = decode(f.read())

np.allclose(im_1, im_2)
# False

But they do not (the difference is small, but it is there). From the function's documentation:

"""
compression_ratios : list[float], optional. Required if using lossy encoding, this is the compression ratio to use for each layer. Should be in decreasing order (such as [80, 30, 10]) and the final value may be 1 to indicate lossless encoding should be used for that layer. Cannot be used with signal_noise_ratios.
"""

True lossless encoding is achieved with encode(dcm.pixel_array), whereas encode(dcm.pixel_array, compression_ratios=[1]) leaves a small residual.
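One way to make the residual explicit is to compare each decoded image against the original pixel data rather than against the other decoded image. The sketch below is only an illustration: roundtrip_residual is a hypothetical helper name, not part of openjpeg, and it assumes the encoder accepts an array plus keyword arguments and the decoder returns an array-like of the same shape, as in the calls above.

```python
import numpy as np


def roundtrip_residual(arr, encode_fn, decode_fn, **encode_kwargs):
    """Return the largest absolute difference after an encode/decode round trip.

    Hypothetical helper for illustration; `encode_fn` and `decode_fn` are
    whatever codec functions you pass in (for example openjpeg's encode
    and decode).
    """
    decoded = decode_fn(encode_fn(arr, **encode_kwargs))
    # Widen to int64 so the subtraction cannot overflow for int16 input
    diff = np.asarray(decoded, dtype=np.int64) - np.asarray(arr, dtype=np.int64)
    return int(np.abs(diff).max())


# Intended use with the calls from this issue (not executed here):
#   from pydicom import dcmread
#   from openjpeg import encode, decode
#   arr = dcmread("chest.dcm").pixel_array
#   roundtrip_residual(arr, encode, decode)
#   roundtrip_residual(arr, encode, decode, compression_ratios=[1])
```

A result of 0 means the round trip reproduced every pixel exactly; any positive value is the worst-case per-pixel error. An exact comparison is used here instead of np.allclose because np.allclose applies tolerances, which can hide small differences in a lossless check.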

*Attached is an example CT DICOM file (single channel, grayscale, int16) from public data. I only changed the extension to .txt, since GitHub does not allow .dcm attachments; you can probably reproduce this with any other DICOM file.

chest.txt

Answer 1 · 2024-02-04T03:10:41.000Z

Using compression_ratios (or signal_noise_ratios) signals that you want lossy compression with DWT 9-7. Would you prefer to have compression_ratios=[1] signal lossless mode? Or just a docstring update to make it a bit clearer?

Answer 2 · 2024-02-04T16:31:54.000Z

I feel like this only requires a documentation edit, since the current wording is confusing. That is, instead of "the final value may be 1 to indicate lossless encoding", use something like "compression_ratios=[1] does not correspond to lossless encoding", since it does not, even though the documentation says it does.

Answer 3 · 2024-03-23T22:33:43.000Z

I ended up changing the docstring and making the following equivalent and lossless:

encode(arr)
encode(arr, compression_ratios=[1])
encode(arr, signal_noise_ratios=[0])