ccpem/mrcfile

FEI extended header implementation different from paper


The current implementation of the FEI extended header follows the specification in the EPU manual as linked in the MRC2014 home page.

However, the paper has a completely different version of the FEI extended header:

> These files currently include a 128 bytes extended header entry per image, located between the main header and the data block, which includes additional metadata such as tilt angles, stage position, exposure time, etc. The extended header has a fixed size of 1024 entries if NZ ⩽ 1024 and 8192 entries if NZ > 1024.

This is the extended header as used in tomography, and unfortunately, opening files with such a header currently fails with mrcfile.

I'd suggest that the package implement the extended header as outlined in the paper, and that other FEI extended headers (like the one in EPU) get their own exttyp.

Thanks for pointing this out. It's definitely a problem if mrcfile fails to open files from FEI tomography software. Could you send me an example of a file that won't open?

I should mention that your quote from the paper is from section 3, which describes the various non-standard extensions to MRC that were known to be in use when the paper was written. Those extensions don't form part of the specification because they were not all adopted in the MRC2014 standard, which is described in section 4.

Of course, it's up to ThermoFisher to decide how they want to handle the extended headers in their own software, but it does seem confusing to use different extended header formats with the same exttyp indicator. I'd suggest it might be sensible to use FEI1, FEI2 etc for different extended header types, and if ThermoFisher can publish the specifications then I'd be happy to support them in mrcfile.

@colinpalmer Thanks for the clarification. I can't seem to find a publicly available tilt series that obviously uses the FEI extended header, but the source of the error is that the tomography extended header in the file I tried has a size of 128 * 1024 = 131072 bytes, and the checking code sees that 131072 / 768 is not an integer. Here, 768 == mrcfile.dtypes.FEI_EXTENDED_HEADER_DTYPE.itemsize.
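To make the arithmetic concrete, here is a minimal sketch of that divisibility check in plain Python. The sizes are the ones quoted in this thread; the variable names are illustrative and are not mrcfile's own.

```python
# Sizes quoted in this thread; the names are illustrative only.
LEGACY_ENTRY_SIZE = 128     # bytes per image in the older tomography header
LEGACY_ENTRY_COUNT = 1024   # fixed number of entries when NZ <= 1024
EPU_ENTRY_SIZE = 768        # itemsize quoted for FEI_EXTENDED_HEADER_DTYPE

legacy_header_size = LEGACY_ENTRY_SIZE * LEGACY_ENTRY_COUNT

# A header that really held 768-byte entries would leave no leftover bytes.
whole_entries, leftover_bytes = divmod(legacy_header_size, EPU_ENTRY_SIZE)

print(legacy_header_size)             # 131072
print(whole_entries, leftover_bytes)  # 170 512
```

The 512 leftover bytes are why the size check rejects the file.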

However, I tested some more files, and they work fine. They were newer and acquired with HAADF-STEM. The non-working file is an older BF-TEM dataset. I'll have to check some more files to find the root cause, whether it's due to modality, application or date of generation.

But anyhow, it would be nice if these (older) file types were supported as well. Maybe a simple check could determine which type a file uses, such as checking whether the size is 131072 and probing one or two values.
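A rough sketch of what such a size-based check might look like follows. The function `guess_ext_header_layout` and its return labels are hypothetical and not part of mrcfile. It only looks at the total size (nsymbt) and leaves out the "probe one or two values" step, since the field layout isn't given here.

```python
def guess_ext_header_layout(nsymbt):
    """Guess the extended header layout from its size in bytes.

    Hypothetical helper: the sizes come from this thread, the labels are made up.
    """
    # The paper describes 1024 or 8192 entries of 128 bytes each.
    legacy_sizes = {128 * 1024, 128 * 8192}
    if nsymbt == 0:
        return "none"
    if nsymbt in legacy_sizes:
        return "legacy-128"
    if nsymbt % 768 == 0:
        return "epu-768"
    return "unknown"


print(guess_ext_header_layout(131072))   # legacy-128
print(guess_ext_header_layout(768 * 10)) # epu-768
```

Neither legacy size is a multiple of 768, so the two branches cannot both match the same file.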

> I should mention that your quote from the paper is from section 3, which described the various non-standard extensions to MRC that were known to be in use when the paper was written. Those extensions don't form part of the specification because they were not all adopted in the MRC2014 standard, which is described in section 4.

Does that imply that the extended headers are not at all part of the spec?

OK, thanks for the explanation. I'll see if I can generate a fake example to use in testing, but it'd still be helpful to use a real file if you can find one.

Do the newer HAADF-STEM files still use an extended header? Have they adopted the EPU format?

> Does that imply that the extended headers are not at all part of the spec?

My understanding is that the existence of the extended header is part of the spec (so exttyp should be one of the values listed in Table 2 in the paper, and nsymbt should indicate its size) but the format and details of any data in the extended header are the responsibility of the various different packages, and are not defined or restricted by the spec.
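As an illustration of the two fields mentioned above, here is a small standard-library sketch that pulls nsymbt and exttyp out of a 1024-byte main header. The byte offsets (92 and 104, i.e. words 24 and 27) follow my reading of the MRC2014 layout and the code assumes a little-endian file, so treat both as assumptions to verify against the spec. The helper name `read_ext_header_info` is made up. With mrcfile itself, the same values should be available as `mrc.header.nsymbt` and `mrc.header.exttyp`.

```python
import struct


def read_ext_header_info(header_bytes):
    """Return (nsymbt, exttyp) from a 1024-byte MRC main header.

    Assumes a little-endian file; offsets are 0-based and are an assumption
    based on the MRC2014 word numbering (word 24 and word 27).
    """
    if len(header_bytes) < 1024:
        raise ValueError("MRC main header must be 1024 bytes")
    (nsymbt,) = struct.unpack_from("<i", header_bytes, 92)  # extended header size
    exttyp = bytes(header_bytes[104:108])                   # 4-character type code
    return nsymbt, exttyp


# Build a fake header purely for demonstration.
header = bytearray(1024)
struct.pack_into("<i", header, 92, 131072)
header[104:108] = b"FEI1"

print(read_ext_header_info(header))  # (131072, b'FEI1')
```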

> Do the newer HAADF-STEM files still use an extended header? Have they adopted the EPU format?

It seems so. But I'll ask around and try to gather some data on the history and the situation with the legacy headers. Maybe the only files with FEI1 header type are indeed the ones that follow the EPU format.

In any case, it would be nice if mrcfile.open() didn't fail with files that declare an incorrect extended header type. Maybe the permissive=True switch could make that error a warning, and users could force a specific header type in a public read_extended_header() method?

This is now fixed. If a file has an extended header but the type doesn't seem to match, it falls back to a void dtype for the extended header, issues a warning, and opens the file without error.
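For readers who want to see the shape of that fallback, here is a simplified sketch of the idea. It is not the actual mrcfile code: `pick_item_size` and `KNOWN_ITEM_SIZES` are hypothetical names, and returning an item size of 1 stands in for "treat the header as raw bytes" (the void dtype).

```python
import warnings

# Only the size mentioned in this thread; a hypothetical lookup table.
KNOWN_ITEM_SIZES = {b"FEI1": 768}


def pick_item_size(exttyp, nsymbt):
    """Return the item size to interpret the header with, or 1 for raw bytes."""
    item_size = KNOWN_ITEM_SIZES.get(exttyp)
    if item_size is None:
        # Unknown or blank type: nothing to check, keep the raw bytes.
        return 1
    if nsymbt % item_size != 0:
        # Declared type doesn't fit the size: warn instead of raising.
        warnings.warn(
            f"extended header of {nsymbt} bytes does not match type "
            f"{exttyp!r}; treating it as raw bytes"
        )
        return 1
    return item_size
```

With the problem file from this thread, `pick_item_size(b"FEI1", 131072)` would warn and return 1 rather than raise.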

I don't think there's any need for a specific read_extended_header() method - the extended header is always read in any case, so if you know the type you want, you can just set the dtype of the extended header array yourself.
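A minimal sketch of reinterpreting the raw bytes with NumPy, assuming the array is a contiguous 1-D void (`V1`) array like the one the fallback produces. The array below is simulated; with a real file it would come from `mrc.extended_header`. Treating each 128-byte entry as 32 little-endian float32 values is an assumption made for illustration, since the field layout of the old header isn't given here, and `as_legacy_entries` is a made-up helper name.

```python
import numpy as np


def as_legacy_entries(ext_header):
    """View raw header bytes as rows of 32 float32 values (128 bytes per row).

    The 32 x float32 layout is an assumption, not a documented format.
    """
    return ext_header.view("<f4").reshape(-1, 32)


# Simulate a raw two-entry extended header (2 * 128 bytes) as a V1 array.
raw = np.arange(64, dtype="<f4").view("V1")

entries = as_legacy_entries(raw)
print(entries.shape)   # (2, 32)
print(entries[1, 0])   # 32.0
```

Because `view` reinterprets the existing bytes without copying, the same approach works with a structured dtype once the real field layout is known.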

Great, thanks for the quick fix!

Regarding the extended header format, people tell me that it's as I assumed: files which declare FEI1 do use the EPU-style extended header, not the older format. Files with the older format indeed have a blank exttyp field. So there's no urgent need to do anything.

Fine, thanks for the explanation. I wonder where your problem file came from, then? Anyway, the code should handle it now - thanks for prompting me to make the improvement!

Probably the header was corrupted anyhow.