DerivativesDataSink not preventing gzip mtime from being set, leading to non-deterministic file checksums
effigies opened this issue · 2 comments
As reported by @glatard on Mattermost, NIfTI images (sub-032633/ses-001/anat/sub-032633_ses-001_run-1_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz) are being produced with identical uncompressed contents but different compressed checksums. This is almost certainly caused by the mtime field being set in the gzip header, since NIfTI images are written by nibabel in this section:
niworkflows/niworkflows/interfaces/bids.py
Lines 579 to 644 in 11ac7a1
I would suggest we move to the following approach:
img = None
if <detect need to modify header>:
    img = <create img>
if img is not None:
    <write img deterministically>
else:
    <copy file with gzip cleaning>
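For illustration, the two branches above could be sketched with only the Python standard library. This is a minimal sketch, not the niworkflows implementation: the helper names `write_gzip_deterministic`, `copy_gzip_clean`, and `sha256_of` are hypothetical, and the payload is assumed to be the already-serialized image bytes. The key point is passing `mtime=0` and an empty `filename` to `gzip.GzipFile`, so neither the timestamp nor the original file name ends up in the gzip header.

```python
import gzip
import hashlib
import shutil


def write_gzip_deterministic(data, dst):
    """Write bytes to dst as gzip with mtime=0 and no stored file name.

    Hypothetical helper: `data` stands in for the serialized image.
    """
    with open(dst, "wb") as raw:
        # filename="" keeps the FNAME field out of the header;
        # mtime=0 pins the header timestamp instead of using time.time().
        with gzip.GzipFile(filename="", mode="wb", fileobj=raw, mtime=0) as gz:
            gz.write(data)


def copy_gzip_clean(src, dst):
    """Copy a gzip file, recompressing it with a cleaned header.

    Hypothetical helper for the "copy file with gzip cleaning" branch.
    """
    with gzip.open(src, "rb") as fin, open(dst, "wb") as raw:
        with gzip.GzipFile(filename="", mode="wb", fileobj=raw, mtime=0) as gz:
            shutil.copyfileobj(fin, gz)


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at path."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()
```

With both branches going through the same header settings, repeated runs with the same Python and zlib versions should yield identical compressed bytes for identical uncompressed contents.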
This is an LTS-affecting bug, so any fix should target niworkflows 1.3.4.
The solution proposed in nipy/nibabel#1023 (comment) could actually make the non-deterministic file checksum a non-issue.
lmk @effigies
Yes, if you want to use tail -c 8 as a check, that seems fine to me. I still think that a deterministic checksum is a good idea.
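As a rough illustration of why the last 8 bytes can serve as a check: a gzip member ends with the CRC-32 and the size (modulo 2^32) of the uncompressed data, both little-endian, and neither depends on the header mtime. The sketch below assumes single-member gzip files; the function names `gzip_trailer` and `same_uncompressed_content` are hypothetical.

```python
import os
import struct


def gzip_trailer(path):
    """Return (crc32, isize) read from the final 8 bytes of a gzip file."""
    with open(path, "rb") as f:
        f.seek(-8, os.SEEK_END)
        # Two little-endian unsigned 32-bit integers: CRC-32, then ISIZE.
        return struct.unpack("<II", f.read(8))


def same_uncompressed_content(path_a, path_b):
    """Compare two gzip files by trailer only, ignoring header fields."""
    return gzip_trailer(path_a) == gzip_trailer(path_b)
```

Matching trailers indicate matching CRC-32 and length rather than a byte-for-byte proof of equality, so this is a weaker guarantee than comparing checksums of deterministically written files.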