I tried to use your preprocessing methods and pretrained model but it didn't work. May I check with you?
Hello,
I tried to use your preprocessing methods and pretrained model, but it didn't work on my dataset. May I check three questions with you?
(1) I used DCMTK with: dcmj2pnm +on2 --min-max-window --set-window -600 1500 pathToDCM pathToPNG16
Did I use DCMTK with the same command line as yours?
(2) I studied augmentations.py and followed its methods to convert the group of PNG16 images into a tensor
([mean, std] = [128.1722, 87.1849] for normalization; TorchIO for interpolation, but I changed the voxel size to 1.5 * 1.5 * 1.5 mm; the tensor [min, max] was about [-1.4701, 1.4547]; the tensor input into the model had shape (C, T, H, W))
The subplots in this image are some slices from the interpolated tensor (plotted with plt.imshow(slice_of_tensor, cmap='gray')).
Did I convert the PNG16 images into a tensor in the correct way?
Do you think it's a bad idea to change the voxel size to 1.5 * 1.5 * 1.5 mm (you use 1.4 * 1.4 * 2.5 mm)?
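One sanity check I did on the tensor range: assuming the PNG16 values were rescaled to [0, 255] before normalization (my assumption, not stated in your code), the stated mean/std reproduce the [min, max] I observed:

```python
# Check that the observed tensor range matches normalizing a [0, 255] range
# with the stated statistics (assumes PNG16 was rescaled to [0, 255] first).
mean, std = 128.1722, 87.1849

lo = (0 - mean) / std    # expected minimum
hi = (255 - mean) / std  # expected maximum

print(round(lo, 4), round(hi, 4))  # -> -1.4701 1.4547
```

This matches the [-1.4701, 1.4547] range above, so the normalization itself seems consistent.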
(3) I loaded your pretrained model's encoder back into the standard r3d_18 and replaced its last fc layer so that it can train on my 5-class dataset.
import torch
import torchvision

# Standard 3D ResNet-18 to receive the pretrained encoder weights
resnet3d = torchvision.models.video.r3d_18(pretrained=True)
path = "/path/to/65fd1f04cb4c5847d86a9ed8ba31ac1aepoch=10.ckpt"
checkpoint = torch.load(path, map_location="cpu")
(The layer names in your pretrained model differ from the standard r3d_18, so I renamed them back.)
# Drop the checkpoint prefix so the remaining keys match r3d_18's "layerN.*" names
state_dict = {("layer" + k[20:]): v for k, v in checkpoint["state_dict"].items()}
# The first block corresponds to r3d_18's stem (Conv3d + BatchNorm3d), so rename it
state_dict["stem.0.weight"] = state_dict.pop("layer0.0.weight")
state_dict["stem.1.weight"] = state_dict.pop("layer0.1.weight")
state_dict["stem.1.bias"] = state_dict.pop("layer0.1.bias")
state_dict["stem.1.running_mean"] = state_dict.pop("layer0.1.running_mean")
state_dict["stem.1.running_var"] = state_dict.pop("layer0.1.running_var")
state_dict["stem.1.num_batches_tracked"] = state_dict.pop("layer0.1.num_batches_tracked")
# Load only the keys that exist in the standard r3d_18
model_dict_copy = resnet3d.state_dict()
pretrained_dict = {k: v for k, v in state_dict.items() if k in model_dict_copy}
model_dict_copy.update(pretrained_dict)
resnet3d.load_state_dict(model_dict_copy)
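For reference, the stem renaming above is equivalent to a single compact remap; here is the same pattern on a dummy state dict (the dict contents are illustrative, not real checkpoint keys):

```python
# Compact form of the stem.* renaming above, shown on a dummy state dict
sd = {"layer0.0.weight": 1, "layer0.1.bias": 2, "layer1.0.conv1.weight": 3}
renamed = {
    ("stem" + k[len("layer0"):] if k.startswith("layer0.") else k): v
    for k, v in sd.items()
}
print(renamed)  # {'stem.0.weight': 1, 'stem.1.bias': 2, 'layer1.0.conv1.weight': 3}
```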
Your pretrained model has other layers after the encoder but I am not sure whether I should use them.
Do you think I use your pretrained model correctly?
I froze the encoder and trained only the fc layer because my dataset is small, but the training accuracy stays very low (40% for 5-class classification).
Do you think I made any mistakes?
Thank you very much.
Hi,
Thanks for reaching out!
- I think that's the same as the DCMTK conversion command we use, which is:
dcmj2pnm +on2 +Ww -600 1500 dicom_path image_path
- You've got the right tensor dimensions, so that should be good. In terms of the voxel shape, I might keep the original parameters if you intend to use the pretrained weights, but definitely feel free to experiment as they can be context-dependent.
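If you do experiment with spacing, a generic spacing-change resample can be sketched with torch.nn.functional.interpolate (this is just an illustration of the idea, not our preprocessing code; TorchIO's Resample does the same with proper metadata handling):

```python
import torch
import torch.nn.functional as F

def resample(vol, in_spacing, out_spacing):
    """Resample a (C, D, H, W) volume from in_spacing to out_spacing (mm, z/y/x order)."""
    scale = [i / o for i, o in zip(in_spacing, out_spacing)]
    new_size = [max(1, round(s * d)) for s, d in zip(scale, vol.shape[1:])]
    # trilinear interpolation expects a batch dimension
    return F.interpolate(vol[None], size=new_size, mode="trilinear",
                         align_corners=False)[0]

vol = torch.rand(1, 40, 256, 256)                      # e.g. 2.5 x 1.4 x 1.4 mm voxels
iso = resample(vol, (2.5, 1.4, 1.4), (1.5, 1.5, 1.5))  # -> roughly isotropic grid
```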
- In terms of the model, it may be more straightforward to initialize the SybilNet class (see for instance the load_model function in model.py). The additional layers perform pooling on the output activations from the 3D ResNet and learn an attention; I would use the last hidden layer from that (see pool_output["hidden"] in SybilNet).
Then your model could look something like:
import torch
import torch.nn as nn

from sybil.models.sybil import SybilNet  # assumed import path; adjust to your checkout

class NewModel(nn.Module):
    def __init__(self, checkpoint_path):
        super().__init__()
        checkpoint = torch.load(checkpoint_path, map_location="cpu")
        args = checkpoint["args"]
        model = SybilNet(args)

        # Remove "model." prefix from param names
        state_dict = {k[6:]: v for k, v in checkpoint["state_dict"].items()}
        model.load_state_dict(state_dict)  # type: ignore
        self.sybil_model = model

        # some model architecture you wish to train
        self.classifier = nn.Linear(512, 5)

    def forward(self, x):
        output = self.sybil_model(x)
        hidden = output["hidden"]
        y_hat = self.classifier(hidden)
        return y_hat
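For the frozen-encoder training you describe, the usual pattern is to turn off requires_grad on the pretrained part and hand only the remaining parameters to the optimizer; here it is on a toy module (a stand-in, not the actual SybilNet):

```python
import torch
import torch.nn as nn

# Toy stand-in for NewModel: frozen backbone + trainable classifier head
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 4)
        self.classifier = nn.Linear(4, 5)

model = Toy()
for p in model.backbone.parameters():
    p.requires_grad = False  # freeze the pretrained part

# Pass only trainable parameters to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # 4*5 weights + 5 biases = 25
```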
I don't see any major issues otherwise -- I would note that training dynamics can be volatile depending on the hyper-parameters.