Automatic MS lesion segmentation in the spinal cord with the CanProCo dataset

About me

Hi! I'm a PhD student at Aix Marseille Université and Polytechnique de Montréal. My background is in biomedical signal and image processing and neuroscience. For the Brainhack School, I wanted to work with the CanProCo multiple sclerosis (MS) MRI data, one of the most extensive databases of MS lesions in the brain and spinal cord. The focus is on applying state-of-the-art deep learning approaches for biomedical image segmentation and then running a replicability test on 3T and 7T MP2RAGE images from the CRMBM Lab.

Nilser LAINES MEDINA

Project Summary

Introduction

Multiple sclerosis (MS) is a disabling disease of the brain and spinal cord (SC) characterized by the presence of lesions. Studies have demonstrated the added value of the 3T MP2RAGE sequence and the advantages of 7T imaging. Manual lesion identification is slow and subject to inter-rater variability, and only a few initiatives have addressed automatic MS lesion segmentation in the SC.

My project aims to develop an algorithm for segmenting MS lesions in the SC, using a multicentric database (CanProCo) and recent deep learning architectures that perform well on medical imaging, such as nnU-Net. The model will be evaluated on external data, on other contrasts such as MP2RAGE, and at other field strengths such as 7T.

Main Objectives

Interact with CanProCo data.
Provide a preprocessing pipeline for training/testing nnU-Net in the spinal cord.
Train segmentation models of MS lesions in the spinal cord.
Run a replicability test on the MP2RAGE contrast (CRMBM, Marseille data).
Propose improvement paths for the automatic segmentation of MS lesions.

Tools & Methods

Preprocessing: Spinal Cord Toolbox, ivadomed
Deep learning: 2D nnU-Net
Data analysis: Matplotlib, seaborn

Data

Canadian Prospective Cohort study to understand progression in MS: CanProCo

CanProCo Data: The dataset consists of 3T MRI data from 52 healthy controls (HC) and 393 subjects with multiple sclerosis (MS).

CRMBM Data: The dataset consists of 40 subjects with multiple sclerosis (MS) acquired with the MP2RAGE sequence at 3T.

CanProCo MS Dataset Description (N=393)

Center	Manual Segmentation	PSIR	STIR	T2w	Total
Calgary	Lesion segmentation		82		82
	SC segmentation			82
Edmonton	Lesion segmentation	59			59
	SC segmentation			59
Montreal	Lesion segmentation	94			94
	SC segmentation			94
Toronto	Lesion segmentation	80			80
	SC segmentation			80
Vancouver	Lesion segmentation	78			78
	SC segmentation			78
Total		311	82	393	393

Click here to see an interactive image of a manual lesion segmentation in an STIR image.

Methods #1: Splitting of the data for training with cross-validation (CV)

To reduce overfitting and build a more robust model, an automatic five-fold cross-validation process was applied, with one model trained per fold.
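As an illustration of the idea, here is a minimal sketch of a subject-level five-fold split. nnU-Net builds its own split automatically, so this is not the code used in the project: the function name `k_fold_splits`, the subject IDs, and the reuse of seed 99 are assumptions made for the example.

```python
import random


def k_fold_splits(subjects, n_folds=5, seed=99):
    """Return one (train, val) pair of subject lists per fold.

    Hypothetical helper for illustration only; nnU-Net generates
    its own cross-validation split during preprocessing.
    """
    shuffled = sorted(subjects)
    random.Random(seed).shuffle(shuffled)
    # Deal subjects round-robin so fold sizes differ by at most one.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        val = folds[k]
        train = [s for i, fold in enumerate(folds) if i != k for s in fold]
        splits.append((train, val))
    return splits


# Hypothetical subject IDs; the real ones come from the BIDS dataset.
subjects = [f"sub-{i:03d}" for i in range(1, 11)]
splits = k_fold_splits(subjects)
for fold, (train, val) in enumerate(splits):
    print(f"fold {fold}: {len(train)} training / {len(val)} validation subjects")
```

Splitting by subject rather than by image keeps all data from one patient in the same fold, so each subject is used for validation exactly once.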

Methods #2: Pre-processing Pipeline for training and testing

The following preprocessing pipeline was applied to the entire CanProCo database and to the MP2RAGE database. The same preprocessing must be applied to any external data on which the model is tested.

Methods #3: Conversion from BIDS formalism to nnU-Net formalism

nnU-Net requires a particular data structure for training and testing, which we will call the "nnU-Net formalism". A script in the ivadomed repository converts data from the BIDS formalism to the nnU-Net formalism. This step is crucial for both training and testing.
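To make the target structure concrete, the sketch below reproduces the nnU-Net v2 naming convention: images go in `imagesTr`/`imagesTs` with a four-digit channel suffix, labels go in `labelsTr` without it, and a `dataset.json` describes channels and labels. It is not the conversion script itself. The case identifier, the channel name `MRI`, the label name `lesion`, the single-channel setup, and the count of 100 cases are all assumptions for illustration.

```python
import json


def nnunet_case_files(case_id, n_channels=1, file_ending=".nii.gz"):
    """Return (image file names, label file name) for one case.

    Images carry a 4-digit channel index; the label file does not.
    """
    images = [f"{case_id}_{ch:04d}{file_ending}" for ch in range(n_channels)]
    label = f"{case_id}{file_ending}"
    return images, label


def dataset_json(channel_names, labels, num_training, file_ending=".nii.gz"):
    """Build the dictionary stored as dataset.json in the dataset folder."""
    return {
        "channel_names": {str(i): name for i, name in enumerate(channel_names)},
        "labels": labels,
        "numTraining": num_training,
        "file_ending": file_ending,
    }


# Hypothetical case ID and metadata, assuming one contrast per case.
images, label = nnunet_case_files("msLesion_001")
meta = dataset_json(["MRI"], {"background": 0, "lesion": 1}, num_training=100)

print("Dataset520_ms_lesion_PSIR_STIR/imagesTr/" + images[0])
print("Dataset520_ms_lesion_PSIR_STIR/labelsTr/" + label)
print(json.dumps(meta, indent=2))
```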

Commands to use

Convert from BIDS to nnUnet:

python convert_bids_to_nnUnetv2.py --path-data BIDS_RPI_STIR_SPIR/ --path-out nnUNet_raw --dataset-name ms_lesion_PSIR_STIR --label-suffix lesion-manual --dataset-number 520 --contrasts PSIR STIR --seed 99 --split 0.8 0.2 --labels-path-name BIDS_RPI_STIR_SPIR/derivatives/labels/ --session-name ses-M0

Launch preprocessing for training:

nnUNetv2_plan_and_preprocess -d 520 --verify_dataset_integrity

Launch parallel training:

CUDA_VISIBLE_DEVICES=6 nnUNetv2_train 520 2d 0 --npz

CUDA_VISIBLE_DEVICES=7 nnUNetv2_train 520 2d 1 --npz

CUDA_VISIBLE_DEVICES=3 nnUNetv2_train 520 2d 2 --npz

CUDA_VISIBLE_DEVICES=4 nnUNetv2_train 520 2d 3 --npz

CUDA_VISIBLE_DEVICES=5 nnUNetv2_train 520 2d 4 --npz
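The five commands above follow one pattern: one fold per GPU. As a hedged convenience sketch, they can be generated from a fold-to-GPU mapping. The helper names `training_commands` and `format_command` are hypothetical, and the sketch only prints the commands without launching them.

```python
import shlex


def training_commands(dataset_id, config, fold_to_gpu):
    """Return one (environment, argv) pair per fold."""
    commands = []
    for fold, gpu in sorted(fold_to_gpu.items()):
        env = {"CUDA_VISIBLE_DEVICES": str(gpu)}
        argv = ["nnUNetv2_train", str(dataset_id), config, str(fold), "--npz"]
        commands.append((env, argv))
    return commands


def format_command(env, argv):
    """Render a command as a shell line with its environment prefix."""
    prefix = " ".join(f"{key}={value}" for key, value in env.items())
    return f"{prefix} {shlex.join(argv)}"


# Fold-to-GPU mapping taken from the commands above.
fold_to_gpu = {0: 6, 1: 7, 2: 3, 3: 4, 4: 5}
lines = [format_command(env, argv)
         for env, argv in training_commands(520, "2d", fold_to_gpu)]
print("\n".join(lines))
```

To actually start the trainings from Python, each `argv` could be passed to `subprocess.Popen` with the environment merged into `os.environ`, which is left out here on purpose.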

Find the best configuration for testing:

nnUNetv2_find_best_configuration 520 -c 2d

Output (to test):

nnUNetv2_apply_postprocessing -i brainhack/ensembling_STIR_PSIR -o brainhack/ensembling_STIR_PSIR_proba -pp_pkl_file nnUNet_results/Dataset520_ms_lesion_PSIR_STIR/nnUNetTrainer__nnUNetPlans__2d/crossval_results_folds_0_1_2_3_4/postprocessing.pkl -np 8 -plans_json nnUNet_results/Dataset520_ms_lesion_PSIR_STIR/nnUNetTrainer__nnUNetPlans__2d/crossval_results_folds_0_1_2_3_4/plans.json

Now let's move to our external database (CRMBM, Marseille):

Convert from BIDS to nnUnet:

python convert_bids_to_nnUnetv2.py --path-data bids_mp2rage/ --path-out nnUNet_raw --dataset-name ms_lesion_T1q_UNI --label-suffix lesion-manual --dataset-number 524 --contrasts T1q UNI --seed 99 --split 0.5 0.5 --labels-path-name bids_mp2rage/derivatives/labels/ --session-name ses-M0

Test on the external database:

nnUNetv2_predict -d Dataset520_ms_lesion_PSIR_STIR -i nnUNet_raw/Dataset522_ms_lesion_T1q_UNI/imagesTs -o brainhack/test_T1q_UNI_proba -f 0 1 2 3 4 -tr nnUNetTrainer -c 2d -p nnUNetPlans

Results #1: Curves of training

Five models (one per fold) were trained on different GPUs for approximately 45 hours (1000 epochs), yielding the following training curves. The pseudo Dice converges around 0.5. However, one model (fold 2) has started to overfit: its pseudo Dice falls while the validation loss increases.

Results #2: Test of model in CanProCo dataset

The following distribution of Dice values is irregular: none of the values exceeds 0.8, and some predicted masks do not overlap the ground truth (GT) at all, giving a Dice of 0. There are also empty manual masks in the input images, as well as empty inferences. The boxplots show that the STIR images have a higher resolution than the PSIR images. Click here to see an interactive image of an automatic lesion segmentation by nnU-Net in a PSIR image.
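For reference, the Dice score between a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The sketch below is a minimal NumPy version, not the project's evaluation code. The function name `dice_score` is hypothetical, and returning NaN when both masks are empty is an assumption: the score is undefined in that case and other conventions exist.

```python
import numpy as np


def dice_score(pred, gt, empty_value=float("nan")):
    """Dice overlap between two binary masks of the same shape.

    Returns `empty_value` when both masks are empty, because the
    ratio is undefined there (this convention is an assumption).
    """
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    denominator = pred.sum() + gt.sum()
    if denominator == 0:
        return empty_value
    return float(2.0 * np.logical_and(pred, gt).sum() / denominator)


# Toy 2D masks standing in for lesion segmentations.
gt = np.array([[0, 1, 1], [0, 0, 0]])
partial = np.array([[0, 1, 0], [0, 0, 0]])   # overlaps one of two GT voxels
disjoint = np.array([[0, 0, 0], [1, 0, 0]])  # no overlap with the GT
empty = np.zeros_like(gt)

print(dice_score(partial, gt))    # 2 * 1 / (1 + 2), about 0.667
print(dice_score(disjoint, gt))   # 0.0
print(dice_score(empty, empty))   # nan
```

The three toy cases mirror the situations described above: a partial overlap, a prediction that misses the ground truth entirely, and a pair of empty masks.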

Results #3: Test of model in MP2RAGE dataset

The following distribution of Dice values is also irregular, and none of the values exceeds 0.78. None of the UNI images could be segmented; the T1q images, however, were segmented, with only 8 empty masks.

Click here to see an interactive image of an automatic lesion segmentation by nnU-Net in a 3T MP2RAGE image.

Click here to see an interactive image of an automatic lesion segmentation by nnU-Net in a 7T MP2RAGE image.

Examples of automatic segmentation

Here is an example of the automatic lesion segmentations (green) from our model on images in the CanProCo database (test split) and in the external database (CRMBM, Marseille).

Deliverables

Preprocessing pipeline for training/testing nnU-Net on MS lesions
Preliminary results of MS lesion segmentation in the SC using 2D nnU-Net
Jupyter notebooks for data analysis

Conclusion

Segmenting MS lesions in the spinal cord is a challenge: no segmentation model yet works reliably on MS lesions, which is linked to inter-rater bias and to the small volume and irregularity of the lesions.
This first approach is a 2D nnU-Net trained on STIR/PSIR, followed by a replicability test on MP2RAGE data.
Automatic deep learning models for lesion segmentation remain a work in progress.

Perspectives

Train 3D nnU-Net models
Data augmentation methods (GANs, diffusion models)
Transfer learning from our model to other training tasks
Ensembling of multiple predictions (e.g. Seg MS MP2RAGE model from Basel)

Acknowledgements

I would like to thank the organizers of the Brainhack School 2023 where I consolidated and formalized a lot of knowledge and tools that will be useful for my doctoral project.

Thanks also to the NeuroPoly team and the CRMBM team for their hard work in the acquisition, processing and manual segmentation of the MS patient data.

Special thanks to TA Jan and Andjela for their guidance and support.

References

A. J. Thompson et al., ‘Diagnosis of multiple sclerosis: 2017 revisions of the McDonald criteria’, Lancet Neurol., vol. 17, no. 2, pp. 162–173, Feb. 2018, doi: 10.1016/S1474-4422(17)30470-2.
B. De Leener et al., ‘SCT: Spinal Cord Toolbox, an open-source software for processing spinal cord MRI data’, NeuroImage, vol. 145, pp. 24–43, Jan. 2017, doi: 10.1016/j.neuroimage.2016.10.009.
C. Gros et al., ‘Automatic segmentation of the spinal cord and intramedullary multiple sclerosis lesions with convolutional neural networks’, NeuroImage, vol. 184, pp. 901–915, Jan. 2019, doi: 10.1016/j.neuroimage.2018.09.081.
Isensee, F., Jaeger, P. F., Kohl, S. A. A., Petersen, J., & Maier-Hein, K. H. (2021). nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nature Methods, 18(2), 203–211. https://doi.org/10.1038/s41592-020-01008-z
O. Vincent, C. Gros, J. P. Cohen, and J. Cohen-Adad, ‘Automatic segmentation of spinal multiple sclerosis lesions: How to generalize across MRI contrasts?’, ArXiv200304377 Cs Eess, Jun. 2020, Accessed: May 25, 2022. [Online]. Available: http://arxiv.org/abs/2003.04377
H. Lassmann, ‘Multiple Sclerosis Pathology’, Cold Spring Harb. Perspect. Med., vol. 8, no. 3, Mar. 2018, doi: 10.1101/cshperspect.a028936.
O. Ronneberger, P. Fischer, and T. Brox, ‘U-Net: Convolutional Networks for Biomedical Image Segmentation’. arXiv, May 18, 2015. doi: 10.48550/arXiv.1505.04597.
Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, ‘3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation’. arXiv, Jun. 21, 2016. Accessed: Jun. 30, 2022. [Online]. Available: http://arxiv.org/abs/1606.06650
Oh, J., Arbour, N., Giuliani, F., Guenette, M., Kolind, S., Lynd, L., Marrie, R. A., Metz, L. M., Patten, S. B., Prat, A., Schabas, A., Smyth, P., Tam, R., Traboulsee, A., & Yong, V. W. (2021). The Canadian prospective cohort study to understand progression in multiple sclerosis (CanProCo): Rationale, aims, and study design. BMC Neurology, 21(1), 418. https://doi.org/10.1186/s12883-021-02447-7
Sastre-Garriga, J., Pareto, D., Alberich, M., Rodríguez-Acevedo, B., Vidal-Jordana, À., Corral, J. F., Tintoré, M., Río, J., Auger, C., Montalban, X., & Rovira, À. (2022). Spinal cord grey matter atrophy in Multiple Sclerosis clinical practice. Neuroscience Informatics, 2(2), 100071. https://doi.org/10.1016/j.neuri.2022.100071
