duyhominhnguyen/LVM-Med

Performing 2d model on 3d input

Closed this issue · 8 comments

Hi,

I want to fine-tune the ResNet50 backbone on lung CT images, but I run into a GPU out-of-memory error. I set the batch size to 1 and used gradient accumulation for 8 steps to mimic a batch size of 8, but I still get the out-of-memory error. Could you help me understand how to change the code to use a 2D model on 3D input? In your paper, you say you take the slices of the 3D volume and then merge the segmentation results of the 2D slices. Also, how should I change the data yml file for this? In MMWHS_CT_endtoend_R50.yml you only keep the masks and images that contain substructure 2 of the cardiac data, yet you fine-tune the model for 8 classes, which confuses me. For my case, I want to test how your model performs on 3D CT, so I take all the slices that have a mask and convert the masks into binary masks. I would really appreciate your help.

Best regards,

Hi. Firstly, I would like to thank you for your interest in our work.

  • On the GPU out-of-memory problem: here we stack all 2D slices into one joint 3D volume. Hence, even with batch size = 1, the model still receives input of shape (number of 3D volumes per batch, total 2D slices in that volume, color channels, h, w), which exhausts GPU memory since each 3D volume usually contains around 100 - 300 2D slices. To work around this, you could instead treat the 2D slices within a given 3D volume individually and run 2D segmentation as usual, or you could try running on CPU instead of GPU.
  • On the data yml file: you can adjust this file freely to suit your needs. That said, I'm still not sure what the exact problem is. Could you describe it in more detail?
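As a rough illustration of the first suggestion, one could index the 2D slices of each volume individually so a batch never holds a whole volume. This is a minimal numpy sketch; `SliceDataset` and all names here are illustrative, not taken from the LVM-Med codebase:

```python
# Sketch: treat each 2D slice of a 3D CT volume as an independent sample
# instead of stacking the whole volume into one batch element.
import numpy as np

class SliceDataset:
    """Flattens a list of 3D volumes (D, H, W) into individual 2D slices."""

    def __init__(self, volumes, masks):
        # Build an index of (volume_id, slice_id) pairs so each 2D slice
        # can be fetched on its own, keeping GPU memory per batch small.
        self.volumes, self.masks = volumes, masks
        self.index = [(v, s) for v, vol in enumerate(volumes)
                      for s in range(vol.shape[0])]

    def __len__(self):
        return len(self.index)

    def __getitem__(self, i):
        v, s = self.index[i]
        img = self.volumes[v][s]               # single slice: (H, W)
        img = np.repeat(img[None], 3, axis=0)  # fake RGB channels: (3, H, W)
        return img.astype(np.float32), self.masks[v][s]
```

With this layout, a standard 2D training loop (e.g. a PyTorch `DataLoader` wrapping this dataset) sees only single slices, so the per-batch memory footprint no longer depends on the depth of the volume.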

Hope it might help, feel free to ask if there is any further concern.

Hi,

Thank you for your answer. I just want to test your work on mediastinal lymph node segmentation in lung CT images. As I examined the code for the MMWHS_CT data, you take the slices whose labels include substructure 2; these masks are then converted into binary masks, but the number of classes is set to 8. Since the focus is on segmenting substructure 2 by taking only the relevant masks, I don't understand why the class number is set to 8. For my case, I changed this section to take all slices whose labels contain at least one mask, convert them into binary masks, and set the number of classes to 2. I think these changes are sufficient to run the fine-tuning code.
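Roughly, my conversion step looks like the following (a minimal numpy sketch; the function name is just illustrative):

```python
import numpy as np

def to_binary_slices(volume, label_volume):
    """Keep only the slices whose label map has at least one labelled
    pixel, and collapse all label values to a single foreground class.
    volume, label_volume: arrays of shape (D, H, W)."""
    imgs, masks = [], []
    for img, lab in zip(volume, label_volume):
        if lab.any():                                  # slice contains a mask
            imgs.append(img)
            masks.append((lab > 0).astype(np.int64))   # binary mask: 0 / 1
    return np.stack(imgs), np.stack(masks)
```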

Best regards,

Hi, thank you again for your interest in our work.

  • In the code for the MMWHS_CT data, we can still set numclass = 8 (although numclass = 2 is better and more accurate) because the smp library trains the model to output, for each pixel, a probability over the numclass = 8 (or numclass = 2) classes, and the masks are likewise converted into a one-hot representation with 8 classes. In the evaluation stage, we define the eval_class we want to segment. For each pixel we then pick the class with the highest probability and check whether it matches the class index defined beforehand (eval_class = 2). You can check the function def evaluate(net, dataloader, device, eval_class) in evaluate.py. However, we admit that we should set numclass = 2 instead of numclass = 8, because this is a 3D binary segmentation task (for 2D tasks we set numclass = 2).

  • If you use this code for binary segmentation, you can set numclass = 2 in both the training and evaluation code. Then set the eval_class you want to segment (for binary 2D tasks we set eval_class = 1) and, in the masks used for training and evaluation, change the value of the segmented pixels to eval_class. You can check dataloader/dataset_ete.py to see how we define eval_class to train the model.
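To illustrate the eval_class idea described above (this is only a sketch of the principle, not the actual evaluate() from evaluate.py): the model outputs per-pixel class probabilities, we take the argmax, and then score only the one class we care about.

```python
import numpy as np

def per_class_iou(probs, target, eval_class):
    """probs: (C, H, W) per-pixel class probabilities;
    target: (H, W) integer label map.
    Pick the argmax class per pixel, then compute the IoU for
    eval_class only, ignoring all other classes."""
    pred = probs.argmax(axis=0)
    pred_c = pred == eval_class
    tgt_c = target == eval_class
    inter = np.logical_and(pred_c, tgt_c).sum()
    union = np.logical_or(pred_c, tgt_c).sum()
    # Convention: empty union (class absent everywhere) scores 1.0
    return float(inter) / union if union else 1.0
```

This is why numclass = 8 still "works" for evaluating a single structure: the extra output channels are simply never selected by the eval_class filter.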

Hope it might help, feel free to ask if there is any further concern.

Hi,

Thank you for the clarification; I changed the parameters as you suggested. Now I can test your model on mediastinal lymph node segmentation as a binary segmentation task, but the 3D IoU score comes out very low. I perform binary segmentation because I want to test the ResNet50 model on segmenting the lymph nodes without specifying classes/anatomical stations. Thank you for your answers.

Best regards,

However, I am still confused about the MMWHS_CT data. There are a total of 8 classes in the data, 7 substructures plus 1 "other" class, but in the code only the 2nd substructure is segmented. If I understand correctly, only the masks that contain substructure 2 are taken and converted into binary masks, yet there are also masks for the other 6 substructures. I am really confused about this part. Thank you for your patience and answers.

Best regards,

Hi @denemeGit11, for the MMWHS_CT data we train on all classes but only evaluate the 2nd structure, as you mentioned. This is done to compare with previous baselines on the left atrial blood cavity region, since they only report results on that structure. In other words, you can use our code to evaluate any other structure by changing the eval_class variable.

Regarding the low performance on the mediastinal lymph nodes, have you solved it? You could experiment with a few tricks such as normalizing the gradient or adjusting the learning rate of Adam (lr = 0.001, lr = 0.01, etc.) for your case. Also make sure to normalize your input data into a proper range. Please let us know if it still does not work.
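As an example of the input-normalization point: CT intensities span a wide Hounsfield range, so a common preprocessing step is to clip to a window and rescale to [0, 1]. This is a sketch; the window bounds below are illustrative defaults, not values from the LVM-Med code, and should be chosen to cover the mediastinum for lymph-node segmentation.

```python
import numpy as np

def normalize_ct(volume, hu_min=-1000.0, hu_max=400.0):
    """Clip CT intensities to a Hounsfield window [hu_min, hu_max]
    and rescale linearly to [0, 1]. Without a step like this, raw HU
    values (roughly -1000 to +3000) can destabilize training."""
    vol = np.clip(volume.astype(np.float32), hu_min, hu_max)
    return (vol - hu_min) / (hu_max - hu_min)
```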

Hi,

Thanks for your answers. I will let you know when the fine-tuning is complete. I changed some parameters and added more data; now the 2D IoU score is acceptable, around 0.85, but the 3D IoU score comes out pretty low, around 0.13.

@denemeGit11, then you should check the 3D evaluation function to make sure you pass the right label number when measuring accuracy.
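For instance, a volume-level IoU can be computed by stacking the per-slice predictions back into a 3D volume and scoring a single label. This is a sketch, not the repo's evaluation code; the key point it illustrates is that `label` must match the value actually stored in your masks (e.g. a binary foreground of 1 vs. a leftover eval_class id), since a mismatch there can deflate the 3D score even when the 2D IoU looks fine.

```python
import numpy as np

def iou_3d(pred_slices, target_slices, label=1):
    """Stack per-slice 2D predictions and targets into 3D volumes and
    compute one volume-level IoU for `label`. pred_slices/target_slices
    are sequences of (H, W) integer arrays in matching slice order."""
    pred = np.stack(pred_slices) == label
    tgt = np.stack(target_slices) == label
    inter = np.logical_and(pred, tgt).sum()
    union = np.logical_or(pred, tgt).sum()
    # Convention: empty union (label absent everywhere) scores 1.0
    return float(inter) / union if union else 1.0
```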