`encode` lossless documentation correction
Closed this issue · 3 comments
From the `encode` documentation, it seems that these calls should give the same result:
import numpy as np
from pydicom import dcmread
from openjpeg import encode, decode
dcm = dcmread("chest.dcm")
with open('test_jpeg_1.j2k', 'wb') as f:
    f.write(encode(dcm.pixel_array))
with open('test_jpeg_2.j2k', 'wb') as f:
    f.write(encode(dcm.pixel_array, compression_ratios=[1]))
with open('test_jpeg_1.j2k', 'rb') as f:
    im_1 = decode(f.read())
with open('test_jpeg_2.j2k', 'rb') as f:
    im_2 = decode(f.read())
np.allclose(im_1, im_2)
# False
But they do not (the difference is small, but it is there). From the function documentation:
"""
compression_ratios : list[float], optional. Required if using lossy encoding, this is the compression ratio to use for each layer. Should be in decreasing order (such as `[80, 30, 10]`) and the final value may be `1` to indicate lossless encoding should be used for that layer. Cannot be used with `signal_noise_ratios`.
"""
True lossless is achieved with `encode(dcm.pixel_array)`, while `encode(dcm.pixel_array, compression_ratios=[1])` results in a small residual.
*Attached is an example CT DICOM file (single channel, grayscale, int16) from public data (I just changed the extension to `.txt`, since GitHub does not allow `.dcm`), but you can probably reproduce this with any other DICOM file.
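To put a number on the residual described above, a small helper along these lines may be useful. It is only a sketch: `residual_summary` is a hypothetical name, not part of `openjpeg` or `numpy`, and it assumes both decoded images have the same shape. It uses `np.array_equal` for an exact comparison, which states the intent for integer pixel data more directly than the tolerance-based `np.allclose`.

```python
import numpy as np


def residual_summary(a, b):
    """Return (identical, max_abs_diff, n_differing) for two decoded images.

    Hypothetical helper for illustration; assumes `a` and `b` share a shape.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    # Widen before subtracting so int16 differences cannot overflow.
    diff = a.astype(np.int64) - b.astype(np.int64)
    identical = bool(np.array_equal(a, b))
    return identical, int(np.abs(diff).max()), int(np.count_nonzero(diff))
```

With the variables from the snippet above, `residual_summary(im_1, im_2)` would report whether the two decodes match exactly and, if not, how large the largest pixel difference is.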
Using `compression_ratios` (or `signal_noise_ratios`) signals that you want lossy compression with DWT 9-7. Would you prefer to have `compression_ratios=[1]` signal lossless mode? Or just a docstring update to make it a bit clearer?
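For context on why the wavelet choice matters: in JPEG 2000 the lossless path uses a reversible integer 5-3 wavelet, whereas the 9-7 wavelet has real-valued coefficients and is not exactly invertible in integer arithmetic. The following is a minimal pure-Python sketch of one level of 5-3 lifting on a 1-D, even-length signal, showing that the integer transform round-trips exactly. The function names are illustrative, and this is not how `openjpeg` implements the transform internally.

```python
def dwt53_forward(x):
    """One level of the reversible 5-3 lifting transform on an even-length list of ints."""
    if len(x) % 2:
        raise ValueError("this sketch only handles even-length signals")
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict step: high-pass coefficients, mirroring the signal at the right edge.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    # Update step: low-pass coefficients, mirroring at the left edge.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    return s, d


def dwt53_inverse(s, d):
    """Undo dwt53_forward by reversing the lifting steps in the opposite order."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x
```

Because every lifting step adds or subtracts an integer computed from values the inverse also has access to, the inverse recovers the input exactly, including for negative int16-range samples.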
I feel like this only requires a documentation edit, since the current wording is confusing. That is, instead of "the final value may be 1 to indicate lossless encoding", use something like "`compression_ratios=[1]` does not correspond to lossless encoding", since it does not, even though the documentation says it does.
I ended up changing the docstring and making the following equivalent and lossless:
encode(arr)
encode(arr, compression_ratios=[1])
encode(arr, signal_noise_ratios=[0])
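One way to confirm that the three call forms above all round-trip exactly is sketched below. `all_lossless` and `VARIANTS` are hypothetical names introduced for illustration; the codec functions are passed in as arguments, so the helper makes no assumptions about which release of the library contains the change.

```python
import numpy as np

# The three call forms from the comment above, expressed as keyword arguments.
VARIANTS = {
    "default": {},
    "compression_ratios=[1]": {"compression_ratios": [1]},
    "signal_noise_ratios=[0]": {"signal_noise_ratios": [0]},
}


def all_lossless(arr, encode, decode, variants):
    """Map each label to True if decode(encode(arr, **kwargs)) reproduces arr exactly."""
    results = {}
    for label, kwargs in variants.items():
        decoded = decode(encode(arr, **kwargs))
        results[label] = bool(np.array_equal(decoded, arr))
    return results
```

With `from openjpeg import encode, decode` and the `dcm` object from the original report, `all_lossless(dcm.pixel_array, encode, decode, VARIANTS)` should map every label to `True` on a version that includes the change, and would flag `compression_ratios=[1]` as `False` on a version showing the behavior originally reported.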