oncoray/mirp

Speed up import and processing of DICOM files

Closed this issue · 1 comments

Processing of DICOM files is comparatively slow. I have already cut the processing time by half locally, but some bottlenecks remain.

What should still be done:

  • Merge all _get_limited_metadata tags into a single read. We don't know the object class until the modality is read, and reading the modality already requires DICOM metadata, so it's generally cheaper to read all of these tags once.
  • Prevent check() from removing metadata.
  • Pass metadata and is_limited_metadata in create().
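A minimal sketch of the merged-tag idea, under stated assumptions: the two tag lists, `merged_limited_tags`, `read_limited_metadata` and the `reader` callable are hypothetical names for illustration and are not taken from mirp. In practice the reader could wrap something like `pydicom.dcmread(path, stop_before_pixels=True, specific_tags=tags)`.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Tag = Tuple[int, int]

# Hypothetical per-class tag requirements. The real lists live in each class's
# _get_limited_metadata and are not shown in this issue.
IMAGE_TAGS: List[Tag] = [
    (0x0008, 0x0060),  # Modality
    (0x0020, 0x000D),  # Study Instance UID
    (0x0020, 0x0032),  # Image Position (Patient)
]
RTSTRUCT_TAGS: List[Tag] = [
    (0x0008, 0x0060),  # Modality
    (0x0020, 0x000D),  # Study Instance UID
    (0x3006, 0x0020),  # Structure Set ROI Sequence
]


def merged_limited_tags(*tag_lists: Iterable[Tag]) -> List[Tag]:
    """Return the union of all per-class tag lists in ascending tag order."""
    return sorted(set().union(*tag_lists))


def read_limited_metadata(
    path: str,
    reader: Callable[[str, List[Tag]], Dict[Tag, object]],
) -> Dict[Tag, object]:
    """Read every tag any class might need with a single call to `reader`.

    The modality can then be looked up in the result to pick the object
    class, without opening the file a second time.
    """
    return reader(path, merged_limited_tags(IMAGE_TAGS, RTSTRUCT_TAGS))
```

The result of this single read is what would then be passed along as metadata, together with a flag such as is_limited_metadata, when the object is created.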

@MaEmily

I have now reduced the number of file reads needed to get metadata to the minimum. Additional gains could be reached by optimising how pydicom.dcmread uses its specific_tags argument.

My current impression is that pydicom.dcmread still reads all tags internally in pydicom.filereader.data_element_generator, even when specific_tags is set. The following optimisations may be possible:

  • Stop the generator after the last tag in specific_tags is found. This closes the file, and would allow for reading a single tag to determine modality of DICOM files when scanning a directory.
  • Limit the work done in pydicom.filereader.read_sequence and pydicom.filereader.read_sequence_item if the tag is not in specific_tags.
  • If specific_tags is set and a tag is found, pop that tag from the list of tags, together with any lower-numbered tags (these may not be present in the DICOM file), and stop the generator once the list is empty.
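The early-stop logic in the first and last points can be sketched without touching pydicom internals. This is an illustration only: `read_specific_tags` and the `(tag, value)` stream are hypothetical stand-ins for pydicom.filereader.data_element_generator, and the sketch relies on data elements within a dataset being stored in ascending tag order.

```python
from typing import Dict, Iterable, Tuple

Tag = Tuple[int, int]


def read_specific_tags(
    elements: Iterable[Tuple[Tag, object]],
    specific_tags: Iterable[Tag],
) -> Dict[Tag, object]:
    """Collect the requested tags and stop consuming `elements` early.

    `elements` is assumed to yield (tag, value) pairs in ascending tag order.
    """
    # Sort descending so the smallest pending tag sits at the end of the list.
    pending = sorted(set(specific_tags), reverse=True)
    found: Dict[Tag, object] = {}
    if not pending:
        return found

    for tag, value in elements:
        # Requested tags smaller than the current tag cannot appear any more,
        # so they are absent from this file and can be dropped.
        while pending and pending[-1] < tag:
            pending.pop()
        if pending and pending[-1] == tag:
            found[tag] = value
            pending.pop()
        if not pending:
            # Nothing left to look for: stop reading the rest of the file.
            break
    return found
```

With only the modality tag `(0x0008, 0x0060)` requested, the loop stops as soon as that element has been seen, so later elements such as the pixel data are never pulled from the stream.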
