Speed up import and processing of DICOM files
Closed this issue · 1 comments
alexzwanenburg commented
Processing of DICOM files is comparatively slow. I have already reduced the processing time (locally) by half, but some bottlenecks remain.
What should still be done:
- Merge all `_get_limited_metadata` tags, because we don't know the object class until the modality is read, which requires DICOM metadata. It's generally cheaper to read these once.
- Prevent `check()` from removing metadata.
- Pass `metadata` and `is_limited_metadata` in `create()`.
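A minimal sketch of the first point, assuming each object class exposes its own list of limited-metadata tags. The tag lists and the helper names `merge_tags` and `read_limited_metadata` are hypothetical and only illustrate the idea of merging the tags and reading the file once; `stop_before_pixels` and `specific_tags` are existing `pydicom.dcmread` arguments.

```python
# Hypothetical per-class tag lists; the real lists live in each class's
# _get_limited_metadata and are not shown in this issue.
IMAGE_TAGS = ["Modality", "SeriesInstanceUID", "ImagePositionPatient"]
MASK_TAGS = ["Modality", "SeriesInstanceUID", "FrameOfReferenceUID"]


def merge_tags(*tag_lists):
    """Return the union of several tag lists, keeping first-seen order."""
    merged = {}
    for tags in tag_lists:
        for tag in tags:
            merged.setdefault(tag, None)
    return list(merged)


def read_limited_metadata(path, tags, reader=None):
    """Read all merged tags in a single pass over the file."""
    if reader is None:
        # Imported lazily so the sketch can be exercised without pydicom.
        from pydicom import dcmread
        reader = dcmread
    # One read covers every class, so the modality (and hence the class)
    # can be determined afterwards without opening the file again.
    return reader(path, stop_before_pixels=True, specific_tags=list(tags))


ALL_LIMITED_TAGS = merge_tags(IMAGE_TAGS, MASK_TAGS)
```

The returned dataset could then be passed on as `metadata`, so that later steps do not have to re-read the file.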
alexzwanenburg commented
I have now reduced the amount of file reading needed to get metadata to the minimum. Additional gains could be reached by optimising how `pydicom.dcmread` utilises its `specific_tags` argument.
My current impression is that `pydicom.dcmread` still reads all tags internally in `pydicom.filereader.data_element_generator`. The following optimisations may be possible:
- Stop the generator after the last tag in `specific_tags` is found. This closes the file, and would allow for reading a single tag to determine the modality of DICOM files when scanning a directory.
- Limit the amount of work done in `pydicom.filereader.read_sequence` and `pydicom.filereader.read_sequence_item` if the tag is not in `specific_tags`.
- In case `specific_tags` is set and a tag is found, pop that tag (and any smaller tags -- they may not be present in the DICOM file) from the list of tags, and stop the generator if the set is now empty.
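The early-stop idea from the first and last points can be sketched in plain Python. This is not pydicom's implementation: `read_specific` is a hypothetical stand-in for the generator, elements are modelled as simple `(tag, value)` pairs, and the sketch assumes elements arrive in ascending tag order, as they do within a DICOM dataset. Sequences and the file meta group are ignored here.

```python
def read_specific(elements, specific_tags):
    """Yield only the requested elements and stop once none can still appear.

    `elements` is an iterable of (tag, value) pairs in ascending tag order.
    """
    wanted = sorted(set(specific_tags))  # smallest outstanding tag first
    for tag, value in elements:
        # Requested tags smaller than the current tag cannot appear anymore,
        # so they are absent from this file and can be dropped.
        while wanted and wanted[0] < tag:
            wanted.pop(0)
        if not wanted:
            return  # nothing left to look for: stop reading
        if tag == wanted[0]:
            wanted.pop(0)
            yield tag, value
            if not wanted:
                return  # last requested tag found: stop reading


# Example: find the modality (0008,0060) without touching later elements.
dataset = [
    (0x00080060, "CT"),         # Modality
    (0x00100010, "Doe^John"),   # PatientName
    (0x7FE00010, b"\x00\x01"),  # PixelData
]
found = dict(read_specific(iter(dataset), [0x00080060]))
```

Because the generator returns as soon as the list of outstanding tags is empty, only the first element is consumed in this example, which is the behaviour that would make scanning a directory for modality cheap.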