reading in raw lightsheet data (CZI file)
pr4deepr opened this issue · 21 comments
System and Software
- aicspylibczi Version: 2.8.0
- Python Version: 3.7.10
- Operating System: Windows 10
Description
I am trying to open a CZI file that contains raw lightsheet data, i.e., not deskewed. The deskewed data, saved as a CZI file, opens fine, but the raw (not deskewed) data throws an error. The idea is to read in the raw data and perform deskewing and deconvolution in Python.
Expected Behavior
Expected it to return the czi file as a dask array
Reproduction
This is just example code for troubleshooting. I was initially using aicsimageio directly via imread_dask and was getting the same error.
from aicsimageio.readers import czi_reader
from aicspylibczi import CziFile

img = 'D://Pradeep//Lightsheet//skew_deskew_example/image.czi'
czi_deskew = CziFile(img)
czi_reader.CziReader._daread(img, czi_deskew)
It throws an error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-2-ba37ba0ae147> in <module>
1 # Read first plane for information used by dask.array.from_delayed
----> 2 sample, sample_dims = czi.read_image(**first_plane_read_dims)
3 print(sample_dims)
~\AppData\Local\Continuum\anaconda3\envs\lightsheet\lib\site-packages\aicspylibczi\CziFile.py in read_image(self, **kwargs)
386 #print(cores)
387 #print(plane_constraints)
--> 388 image, shape = self.reader.read_selected(plane_constraints, m_index, cores)
389 #print(shape)
390 #print(image)
RuntimeError: The method or operation is not implemented.
I am not sure what this error means.
I have deskewed data generated from the same dataset by another source, and it reads without problems, giving the output:
(dask.array<concatenate, shape=(119, 3, 75, 1166, 1488), dtype=uint16, chunksize=(1, 1, 75, 1166, 1488), chunktype=numpy.ndarray>,
'TCZYX')
The dimensions of the raw data are: (119, 3, 751, 150, 1488) in TCZYX format.
The dimensions of the deskewed data that works are: (119, 3, 75, 1166, 1488) in TCZYX format.
Environment
Anaconda Environment
Thanks
Pradeep
Hi @pr4deepr,
I'll look into this as soon as I have work on the 3.0 release completed.
It would be ideal to have a test file from your system that has the problematic behavior if at all possible.
It is somewhat likely that the raw/skewed data is not supported by libCZI. I might be able to patch libCZI to make that work, but that's an unknown. If you can get me a small test file, that would be fantastic. I hope to take a look at it within a week.
Thanks
@heeler
Hi @heeler
Please find the data here. It's a WeTransfer link.
I can use the czi2tif option from here: https://github.com/cgohlke/czifile
to convert small czi files into tiff files, and can access the metadata. But it's only sensible for small files...
Cheers
Pradeep
Hey @pr4deepr I just saw your talk on Dask Summit and it served as a reminder for me to check this issue (sorry for the delay)!
@heeler has unfortunately taken a new job, so I will have to get myself caught up on what is going on with this issue. I am curious if you have encountered the chunking problem on other file formats. Is it just CZI, or does it affect the whole aicsimageio lib?
Excited to chat at the Dask summit life sciences workshop too!
Hey @JacksonMaxfield
I just saw your talk. I really enjoyed it and I think it answered some questions that I had about processing the large datasets and memory usage!!
"I just saw your talk on Dask Summit and it served as a reminder for me to check this issue (sorry for the delay)!"
Apologies, didn't mean to put up the github issue like that, I just wanted to showcase my workflow and where I'm at.
Currently, I have only tried it on CZI files. I can access the CZI file and explore the metadata and subblocks using the czifile library. I get the error only when I try to read it in using the aicspylibczi or aicsimageio libraries, especially as a dask array.
We mainly use Zeiss microscopes here, and particularly the Zeiss Lattice in this case. I haven't tried it on other file formats. We have a home-built lattice, so I can try it on the tiff files that it churns out. Will that work for you?
It will be great to chat with you. When will you be attending the life science workshop?
"Apologies, didn't mean to put up the github issue like that, I just wanted to showcase my workflow and where I'm at."
No worries at all. It was a helpful reminder for me and useful to hear about the issues.
Hmmm, well, normally I would suggest upgrading to aicsimageio 4.0.0.dev6, but CZI reading hasn't made it into that dev release yet. I tried, and we have benchmarks that show the peak memory used while reading files (and I manually ran some tests last night) to make sure that, at least for TIFFs, we aren't reading more data into memory than asked for: see the 4.0.0 benchmarks. If you click on the AICSImageIO peakmem benchmarks, you can see that cached_array and delayed_array differ considerably in the MB read during the process. But I will continue to look into the memory issue.
Also, a note on your talk: the size shown in the Dask array Jupyter / HTML repr is not the number of bytes already read. It is just the size of all the chunks combined.
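To illustrate the point about the repr, here is a minimal sketch that computes the nominal size from the shape, chunk size, and dtype alone, using the values reported for the deskewed file earlier in this thread. No data is read at any point, which is why this number says nothing about memory actually used. The helper name `nominal_nbytes` and the `ITEMSIZE` table are made up for this example and are not part of Dask or aicsimageio.

```python
import math

# Bytes per element for a few common pixel types (illustrative table).
ITEMSIZE = {"uint8": 1, "uint16": 2, "float32": 4}


def nominal_nbytes(shape, dtype):
    """Size implied by shape and dtype alone; nothing is read from disk."""
    return math.prod(shape) * ITEMSIZE[dtype]


# Shape and chunk size reported for the deskewed file (TCZYX, uint16).
shape = (119, 3, 75, 1166, 1488)
chunks = (1, 1, 75, 1166, 1488)

total_bytes = nominal_nbytes(shape, "uint16")
chunk_bytes = nominal_nbytes(chunks, "uint16")
n_chunks = total_bytes // chunk_bytes

print(
    f"array: {total_bytes / 2**30:.1f} GiB in {n_chunks} chunks "
    f"of {chunk_bytes / 2**20:.1f} MiB each"
)
```

The figure that matters for peak memory is closer to the per-chunk size multiplied by the number of chunks held in memory at once, which depends on the computation and the scheduler.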
Now, on to your current issue. I will manually give your file a go on the newest release of aicspylibczi and see if I can find anything.
And lastly, I will try to go to both life science workshop sessions, but will for sure be at the first one (May 19, 16:00 PDT / May 19, 23:00 UTC).
Thanks a lot for looking into this..
The WeTransfer link expired, so I'm posting another link here:
DOWNLOAD
Had a brief moment to look at this this morning. On both the prior and new versions of aicspylibczi it produces the error you noticed. So reproducible! Yay?
What an odd error. I will try to find a time to talk to Jamie about this and see what I can do. I assume it has something to do with typing. Because the underlying reader is written in C++, I wonder if your file has a different return type for some operation, which is causing it to say it has no implementation for those specific types.
Yea, I had a look and realised the reader is in C++, which is where I hit a wall!!!
So, I was comparing a raw data file and the corresponding deskewed/processed data. The latter opens in aicsimageio.
I have been playing around with using CziFile to explore the underlying data structure.
From what I understand about czi files, the data are stored in subblocks, which are in turn listed in a subblock directory.
Using info from the code here:
https://github.com/cgohlke/czifile/blob/a70265fd430983875bf4c31955f2ad57f2592747/czifile/czifile.py#L644
I can access each subblock which contains the image data. This can be accessed using data_segment()
This is my understanding of the czi file.
so, if I look at the first subblock:
import numpy as np
from czifile import CziFile
from tifffile import FileHandle

img_raw = "RAW DATA.czi"
czi_raw = CziFile(img_raw)

# Read, decode, and copy subblock data from the first subblock.
entry = czi_raw.filtered_subblock_directory[0]
subblock = entry.data_segment()

fh_raw = FileHandle(img_raw)  # binary file handle for the czi file
fh_raw.seek(subblock.data_offset)  # move the file pointer to the start of this subblock's data
dtype = np.dtype(subblock.dtype)
data = fh_raw.read_array(dtype, subblock.data_size // dtype.itemsize)
czi_image_raw = data.reshape(entry.stored_shape)
What information would be valuable to compare the raw and deskewed data?
BTW, are you comfortable with me posting this in the image.sc forum?
I am in a workshop with Sebastian Rhodes from Zeiss and he suggested posting it there.
Please do! More eyes the better probably.
Thanks for that @JacksonMaxfield and @heeler ! Appreciate you taking the time for this...
Hey @pr4deepr, just pinging again to say don't worry, we are still tracking this issue, but unfortunately no development has occurred yet. We hope to look at it soon, but again, there is no real timeline. Apologies.
Thanks!
Initial finding: the error message "The method or operation is not implemented." comes from the underlying Zeiss libCZI when it thinks there is an internal compression format it doesn't recognize. It recognizes "JpgXr" and "UnCompressed" according to the code. I am still looking deeper to see how it got there.
Looks like the file contains compression mode 1001 which the libCZI library doesn't recognize and considers "invalid".
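A quick way to check a file for this condition is to tally the compression codes of its subblocks and flag any that fall outside the modes named above. The sketch below is only illustrative: the numeric codes 0 and 4 for "UnCompressed" and "JpgXr" are assumptions, the helper names are made up, and stand-in objects are used so the example runs without a CZI file. With the czifile library, the entries of `filtered_subblock_directory` are assumed to expose an integer `compression` attribute that could be passed in instead.

```python
from collections import Counter
from types import SimpleNamespace

# Assumed numeric codes for the two modes libCZI was reported to recognize.
LIBCZI_KNOWN_MODES = {0: "UnCompressed", 4: "JpgXr"}


def tally_compression(entries):
    """Count subblocks per compression code (entries need `.compression`)."""
    return Counter(int(entry.compression) for entry in entries)


def unrecognized_modes(counts, known=LIBCZI_KNOWN_MODES):
    """Return the sorted compression codes that are not in `known`."""
    return sorted(code for code in counts if code not in known)


# Stand-in directory entries. With czifile, something like
# CziFile(path).filtered_subblock_directory could be used here (assumption).
entries = [SimpleNamespace(compression=code) for code in (1001, 1001, 0)]

counts = tally_compression(entries)
print("subblocks per compression code:", dict(counts))
print("codes not recognized:", unrecognized_modes(counts))
```

Running a check like this on both the raw and the deskewed file would show whether the compression mode is the only difference between them.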
Thanks for this update.
Glad to see that you've figured out why we're getting the error.
Hi
Just updating this thread. There was a bit of delay in getting my hands on some czi files.
With files saved using the newest versions of the Zen software (3.4 onwards), aicsimageio reads the czi files without a problem.
For older files, I need to resave them using the Save As CZI option in Zen to be able to read them with the aicsimageio library.
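For anyone scripting around this, one option is to wrap the read call and turn the opaque libCZI message into a hint about resaving. This is only a sketch: `read_czi_or_explain` and `UnsupportedCziCompression` are hypothetical names, and `read_fn` stands for whatever reader call is in use (it is not a specific aicsimageio or aicspylibczi function). A fake reader mimics the failure so the example runs on its own.

```python
class UnsupportedCziCompression(RuntimeError):
    """Hypothetical error type used only in this sketch."""


def read_czi_or_explain(read_fn, path):
    """Call read_fn(path); translate libCZI's opaque error into a hint."""
    try:
        return read_fn(path)
    except RuntimeError as err:
        if "not implemented" in str(err):
            raise UnsupportedCziCompression(
                f"{path}: libCZI could not decode this file. If it was "
                "written by an older Zen version, try resaving it with "
                "'Save As CZI' in Zen 3.4 or later."
            ) from err
        raise  # unrelated RuntimeError: leave it untouched


def fake_reader(path):
    """Stand-in reader that mimics the failure reported in this thread."""
    raise RuntimeError("The method or operation is not implemented.")


try:
    read_czi_or_explain(fake_reader, "raw_lightsheet.czi")
except UnsupportedCziCompression as err:
    message = str(err)
    print(message)
```

Matching on the message text is fragile, so this is best treated as a convenience for batch jobs rather than a reliable detection method.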
I really appreciate the rapid response and help in this matter.
Do let me know if there is anything else I need to provide
Cheers
Pradeep
Well, glad it was solved. Going to close this issue for now, then. If it comes up again, or if any other issues crop up, just let us know.