AllenCellModeling/aicsimageio

czi number of channels can be out of sync and is reported as error

toloudis opened this issue · 2 comments

Description

Suraj says: Hi. I'm trying to read some CZI files using aicsimageio, but I get a "conflicting sizes for dimension 'C'" error while reading them (Fiji can read these files). Here is the error:

ValueError: conflicting sizes for dimension 'C': length 1 on the data but length 4 on coordinate 'C'

The file contains metadata for 4 channels but the actual array data only consists of 1 channel according to Fiji/Bioformats.

Expected Behavior

I expect the file to load one channel. Perhaps a warning would be acceptable? It should not crash or raise an exception: even if the exception is caught, the user has basically no recourse.

Our code is capturing this as an error but it seems like we should be more lenient and just use what is found in the data. If the data had more channels than the metadata, then we might consider that an error (missing channel names in metadata) but the opposite case seems to be ok.

Reproduction

'/allen/aics/assay-dev/computational/data/4DN_handoff_CTFC/structure_deconvolution/3500004565_100X_ZSD1_20210615-Scene-16-P16-G02_structure_deconvolution.czi'

@toloudis I'm confused about this part of what you said:

If the data had more channels than the metadata, then we might consider that an error (missing channel names in metadata) but the opposite case seems to be ok.

Would it not be more alarming to have missing data when we "should" have more based on the metadata as opposed to having missing metadata when we "should" have more based on the data?

Here's what I was trying to say:

  1. If there are more data channels than metadata channels, then the metadata will be missing information about some of the data. (Do we report the file as having fewer channels, then? Or is this a major error?)
  2. (This issue) If there are more metadata channels than data channels, then at least we can load all the existing data channels and know that they all have metadata. In this scenario, we can report that the metadata has extra info, but at least it's a superset of the true pixels in the file. We should be able to load this without crashing -- as long as the arrays are complete and not cut off due to some I/O error.
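
To make the two cases concrete, here is a minimal sketch of the lenient policy described above. It is not aicsimageio code: `reconcile_channel_names` is a hypothetical helper, and it assumes that when the metadata lists extra channels, the first entries are the ones that correspond to the channels actually present in the data. That ordering assumption may not hold for every CZI file.

```python
import warnings


def reconcile_channel_names(n_data_channels, metadata_names):
    """Return one channel name per channel actually present in the pixel data.

    Hypothetical helper for illustration only; not part of aicsimageio.
    """
    n_meta = len(metadata_names)

    if n_meta > n_data_channels:
        # Case 2 (this issue): the metadata is a superset of the data.
        # Warn and keep going instead of raising.
        # ASSUMPTION: the first names match the channels that exist.
        warnings.warn(
            f"Metadata lists {n_meta} channels but the data has "
            f"{n_data_channels}; ignoring the extra metadata entries.",
            UserWarning,
        )
        return list(metadata_names[:n_data_channels])

    if n_meta < n_data_channels:
        # Case 1: some data channels have no metadata at all.
        # This sketch treats that as an error; the thread leaves it open.
        raise ValueError(
            f"Data has {n_data_channels} channels but metadata only "
            f"describes {n_meta}."
        )

    # Counts agree: nothing to reconcile.
    return list(metadata_names)
```

With the file from this issue (1 data channel, 4 metadata channels), a call such as `reconcile_channel_names(1, names)` would emit a warning and return a single name, so the image could still be loaded.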