drivendataorg/concept-to-clinic

Continuous improvement of lungs segmentation algorithm


The initial lung segmentation method was provided by PR #133 in response to issue #120. The approach can, however, be improved further, e.g. by reducing its tendency toward false positives and by making the algorithm more stable. Furthermore, the ability to correctly separate tissues such as bones, lungs, and bronchi would be beneficial for further work on data augmentation.

Possible Solutions

For example, the work of van Rikxoort et al. describes an automatic error detection method based on the convex hull complement to the coastline of the lungs.
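To make the convex hull idea concrete, here is a rough sketch of a convex hull deficiency check on a single 2D mask. It is only loosely inspired by the description above and is not the method from the paper: the function name `convex_hull_deficiency` is hypothetical, and any threshold on the returned ratio for flagging a segmentation as suspicious would have to be chosen empirically.

```python
import numpy as np
from scipy.spatial import ConvexHull, Delaunay


def convex_hull_deficiency(mask):
    """Return (deficiency_mask, ratio) for a 2D boolean mask.

    deficiency_mask marks pixels that lie inside the convex hull of the
    mask but outside the mask itself; ratio is their share of the hull area.
    Assumes the mask contains at least three non-collinear pixels.
    """
    mask = np.asarray(mask, dtype=bool)
    points = np.argwhere(mask)

    # Convex hull of the foreground pixels, triangulated for point-in-hull tests.
    hull = ConvexHull(points)
    triangulation = Delaunay(points[hull.vertices])

    # Test every pixel coordinate (row, col) against the hull.
    grid = np.indices(mask.shape).reshape(2, -1).T.astype(float)
    inside = triangulation.find_simplex(grid) >= 0

    # The mask is always part of its own hull; the union guards against
    # numerical edge cases on the hull boundary.
    hull_mask = inside.reshape(mask.shape) | mask
    deficiency = hull_mask & ~mask
    return deficiency, deficiency.sum() / hull_mask.sum()
```

A large ratio, or a large connected deficiency region, hints at a concavity in the lung border, which may be a juxtapleural nodule cut out of the mask or simply normal anatomy, so this is a cue for review rather than a verdict.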

Furthermore, the method of S. Hu et al. enhances the junction line and then separates the lungs, though I've found it to be unreasonably resource-intensive.

The bronchi / lungs separation described in the paper of T. Kitasaka et al. would, I guess, also be valuable as an additional instrument for data augmentation.

Volume rendered results of the ground truth (left), the proposed method (middle) and the adaptive branch tracing method (right). Yellow regions indicate TP while blue regions show FP. Though the bronchial lumen is interrupted by a tumor indicated by the arrow, the proposed method can extract the lumen regions beyond the tumor.

P.S. Of course, contributions such as a radiologist's handcrafted annotations of anatomical structures inside a CT volume will be appreciated.

Current Behavior

The lung segmentation in preprocess/lung_segmentation.py consists of the following steps:
Step 1: Convert the image into a binary image.
Step 2: Remove the blobs connected to the border of the image.
Step 3: Label the image.
Step 4: Keep the labels with the 2 largest areas.
Step 5: Apply erosion with a disk of radius 2 to separate lung nodules attached to blood vessels.
Step 6: Apply closing with a disk of radius 10 to keep nodules attached to the lung wall.
Step 7: Fill in the small holes inside the binary lung mask.
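The steps above can be sketched for a single axial slice roughly as follows. This is a minimal illustration built on `scipy.ndimage`, not the code in preprocess/lung_segmentation.py. The function names, the -400 HU threshold, and the assumption that the input is already in Hounsfield units are all mine.

```python
import numpy as np
from scipy import ndimage as ndi


def disk(radius):
    """Boolean disk-shaped structuring element of the given radius."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x * x + y * y <= radius * radius


def segment_lungs_2d(slice_hu, threshold=-400):
    """Return a boolean lung mask for one axial CT slice in Hounsfield units.

    The threshold of -400 HU is an assumption, not a value from the issue.
    """
    # Step 1: binarise, so that air-like pixels become foreground.
    binary = slice_hu < threshold

    # Steps 2-3: label the blobs and drop those touching the image border.
    labels, _ = ndi.label(binary)
    border = np.concatenate([labels[0], labels[-1], labels[:, 0], labels[:, -1]])
    labels[np.isin(labels, np.unique(border))] = 0

    # Step 4: keep the two largest remaining blobs.
    areas = np.bincount(labels.ravel())
    areas[0] = 0  # ignore the background label
    keep = [lab for lab in np.argsort(areas)[::-1][:2] if areas[lab] > 0]
    mask = np.isin(labels, keep)

    # Step 5: erosion with a disk of radius 2.
    mask = ndi.binary_erosion(mask, structure=disk(2))
    # Step 6: closing with a disk of radius 10.
    mask = ndi.binary_closing(mask, structure=disk(10))
    # Step 7: fill the remaining holes inside the mask.
    return ndi.binary_fill_holes(mask)
```

Because only the two largest blobs survive step 4, anything that makes the lungs merge into one blob or makes a non-lung air pocket larger than a lung will produce a wrong mask, which is one source of the instability mentioned above.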

Acceptance criteria

A good measure for a lung segmentation algorithm is the Hausdorff distance, which has an efficient implementation in scipy.spatial.distance.directed_hausdorff, so the acceptance criteria will rely on Hausdorff tests. Nonetheless, this is a long-term issue, so other types of contribution may also be accepted under it.
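As a rough idea of what such a test could compute, the sketch below derives a symmetric Hausdorff distance between a predicted and a reference mask from two calls to `directed_hausdorff`. The helper name `hausdorff_distance` and the `spacing` parameter are hypothetical, and both masks are assumed to be non-empty.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff


def hausdorff_distance(mask_a, mask_b, spacing=1.0):
    """Symmetric Hausdorff distance between two non-empty boolean masks.

    spacing is the voxel size (a scalar or one value per axis); with the
    default of 1.0 the result is expressed in voxels.
    """
    # Coordinates of the foreground voxels, scaled to physical units.
    points_a = np.argwhere(mask_a) * np.asarray(spacing, dtype=float)
    points_b = np.argwhere(mask_b) * np.asarray(spacing, dtype=float)

    # directed_hausdorff is one-sided, so take the larger of both directions.
    forward = directed_hausdorff(points_a, points_b)[0]
    backward = directed_hausdorff(points_b, points_a)[0]
    return max(forward, backward)
```

A test could then assert that this distance stays below an agreed tolerance for each annotated scan. Passing all foreground voxels is simple but memory-hungry for full volumes, so extracting only the mask surfaces first may be worthwhile.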

Yep, we're definitely supposed to refine the lung segmentation. Have a look at e.g. patient 0001 and the nodule at 315.04852321, 365.87447257, -116.12078059 (slice ~91):
screenshot from 2017-09-30 10-12-50
The nodule is labeled as solid, but it isn't included in the current lung segmentation:
screenshot from 2017-09-30 10-12-22
Some of the slices of the nodule are included in the lung segmentation, but this would currently still falsify statistics such as the predicted volume of a nodule, which is to be implemented in #3.

As part of the evaluation, the Luna16 data include segmentations of the right and left lungs and the trachea, which could serve as a good baseline.

New issue #283 focused on nodule segmentation and volume estimation (as opposed to segmentation of the entire lung).