I tried to use your preprocessing methods and pretrained model but it didn't work. May I check with you?
Hello,
I tried to use your preprocessing methods and pretrained model, but it didn't work on my dataset. May I check three questions with you?
(1) I used DCMTK with: dcmj2pnm +on2 --min-max-window --set-window -600 1500 pathToDCM pathToPNG16
Did I use DCMTK with the same command line as yours?
(2) I studied augmentations.py and followed its methods to convert the group of PNG16 images into a tensor
([mean, std] = [128.1722, 87.1849] for normalization; TorchIO for interpolation, but I changed the voxel size to 1.5 * 1.5 * 1.5 mm; the tensor [min, max] was about [-1.4701, 1.4547]; the tensor input into the model had shape (C, T, H, W))
The subplots in this image are some slices from the interpolated tensor (plotted with plt.imshow(slice_of_tensor, cmap='gray')).
Did I convert the PNG16 images into a tensor in the correct way?
Do you think it's a bad idea to change the voxel size to 1.5 * 1.5 * 1.5 mm (you use 1.4 * 1.4 * 2.5 mm)?
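One sanity check I did on the tensor range: assuming the PNG16 values were rescaled to [0, 255] before normalization (my assumption, not stated in your code), the stated mean/std reproduce the [min, max] I observed:

```python
# Check that the observed tensor range matches normalizing a [0, 255] range
# with the stated statistics (assumes PNG16 was rescaled to [0, 255] first).
mean, std = 128.1722, 87.1849

lo = (0 - mean) / std    # expected minimum
hi = (255 - mean) / std  # expected maximum

print(round(lo, 4), round(hi, 4))  # -> -1.4701 1.4547
```

This matches the [-1.4701, 1.4547] range above, so the normalization itself seems consistent.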
(3) I loaded your pretrained model's encoder back into the standard r3d_18 and replaced its last fc layer so that it can train on my 5-class dataset.
import torch
import torchvision

# Standard 3D ResNet-18 to receive the pretrained encoder weights
resnet3d = torchvision.models.video.r3d_18(pretrained=True)
path = "/path/to/65fd1f04cb4c5847d86a9ed8ba31ac1aepoch=10.ckpt"
checkpoint = torch.load(path, map_location="cpu")
(The layer names in your pretrained model differ from the standard r3d_18, so I renamed them back.)
# Drop the checkpoint prefix so the remaining keys match r3d_18's "layerN.*" names
state_dict = {("layer" + k[20:]): v for k, v in checkpoint["state_dict"].items()}
# The first block corresponds to r3d_18's stem (Conv3d + BatchNorm3d), so rename it
state_dict["stem.0.weight"] = state_dict.pop("layer0.0.weight")
state_dict["stem.1.weight"] = state_dict.pop("layer0.1.weight")
state_dict["stem.1.bias"] = state_dict.pop("layer0.1.bias")
state_dict["stem.1.running_mean"] = state_dict.pop("layer0.1.running_mean")
state_dict["stem.1.running_var"] = state_dict.pop("layer0.1.running_var")
state_dict["stem.1.num_batches_tracked"] = state_dict.pop("layer0.1.num_batches_tracked")
# Load only the keys that exist in the standard r3d_18
model_dict_copy = resnet3d.state_dict()
pretrained_dict = {k: v for k, v in state_dict.items() if k in model_dict_copy}
model_dict_copy.update(pretrained_dict)
resnet3d.load_state_dict(model_dict_copy)
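For reference, the stem renaming above is equivalent to a single compact remap; here is the same pattern on a dummy state dict (the dict contents are illustrative, not real checkpoint keys):

```python
# Compact form of the stem.* renaming above, shown on a dummy state dict
sd = {"layer0.0.weight": 1, "layer0.1.bias": 2, "layer1.0.conv1.weight": 3}
renamed = {
    ("stem" + k[len("layer0"):] if k.startswith("layer0.") else k): v
    for k, v in sd.items()
}
print(renamed)  # {'stem.0.weight': 1, 'stem.1.bias': 2, 'layer1.0.conv1.weight': 3}
```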
Your pretrained model has other layers after the encoder but I am not sure whether I should use them.
Do you think I use your pretrained model correctly?
I froze the encoder and trained only the fc layer because my dataset is small, but the training accuracy stays very low (40% for 5-class classification).
Do you think I made any mistakes?
Thank you very much.
Hi,
Thanks for reaching out!
- I think that's the same as the DCMTK conversion command we use, which is:
dcmj2pnm +on2 +Ww -600 1500 dicom_path image_path
- You've got the right tensor dimensions, so that should be good. In terms of the voxel shape, I might keep the original parameters if you intend to use the pretrained weights, but definitely feel free to experiment as they can be context-dependent.
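If you do experiment with spacing, a generic spacing-change resample can be sketched with torch.nn.functional.interpolate (this is just an illustration of the idea, not our preprocessing code; TorchIO's Resample does the same with proper metadata handling):

```python
import torch
import torch.nn.functional as F

def resample(vol, in_spacing, out_spacing):
    """Resample a (C, D, H, W) volume from in_spacing to out_spacing (mm, z/y/x order)."""
    scale = [i / o for i, o in zip(in_spacing, out_spacing)]
    new_size = [max(1, round(s * d)) for s, d in zip(scale, vol.shape[1:])]
    # trilinear interpolation expects a batch dimension
    return F.interpolate(vol[None], size=new_size, mode="trilinear",
                         align_corners=False)[0]

vol = torch.rand(1, 40, 256, 256)                      # e.g. 2.5 x 1.4 x 1.4 mm voxels
iso = resample(vol, (2.5, 1.4, 1.4), (1.5, 1.5, 1.5))  # -> roughly isotropic grid
```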
- In terms of the model, it may be more straightforward to initialize the SybilNet class (see for instance the load_model function in model.py). The additional layers perform pooling on the output activations from the 3D ResNet and learn an attention; I would use the last hidden layer from that (see pool_output["hidden"] in SybilNet).
Then your model could look something like:
import torch
import torch.nn as nn

from sybil.models.sybil import SybilNet  # assumed import path; adjust to your checkout

class NewModel(nn.Module):
    def __init__(self, checkpoint_path):
        super().__init__()
        checkpoint = torch.load(checkpoint_path, map_location="cpu")
        args = checkpoint["args"]
        model = SybilNet(args)

        # Remove "model." prefix from param names
        state_dict = {k[6:]: v for k, v in checkpoint["state_dict"].items()}
        model.load_state_dict(state_dict)  # type: ignore
        self.sybil_model = model

        # some model architecture you wish to train
        self.classifier = nn.Linear(512, 5)

    def forward(self, x):
        output = self.sybil_model(x)
        hidden = output["hidden"]
        y_hat = self.classifier(hidden)
        return y_hat
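For the frozen-encoder training you describe, the usual pattern is to turn off requires_grad on the pretrained part and hand only the remaining parameters to the optimizer; here it is on a toy module (a stand-in, not the actual SybilNet):

```python
import torch
import torch.nn as nn

# Toy stand-in for NewModel: frozen backbone + trainable classifier head
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 4)
        self.classifier = nn.Linear(4, 5)

model = Toy()
for p in model.backbone.parameters():
    p.requires_grad = False  # freeze the pretrained part

# Pass only trainable parameters to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # 4*5 weights + 5 biases = 25
```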
I don't see any major issues otherwise -- I would note that training dynamics can be volatile depending on the hyper-parameters.