StanfordMIMI/DDM2

Configurations for executing code with the PPMI dataset


Hi!
I'm trying to run Stage1 with the PPMI dataset, but I get the following error:

Traceback (most recent call last):
  File "/content/DDM2_test/train_noise_model.py", line 98, in <module>
    trainer.optimize_parameters()
  File "/content/DDM2_test/model/model_stage1.py", line 69, in optimize_parameters
    outputs = self.netG(self.data)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/DDM2_test/model/mri_modules/noise_model.py", line 44, in forward
    return self.p_losses(x, *args, **kwargs)
  File "/content/DDM2_test/model/mri_modules/noise_model.py", line 36, in p_losses
    x_recon = self.denoise_fn(x_in['condition'])
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/DDM2_test/model/mri_modules/unet.py", line 286, in forward
    x = layer(x)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (double) and bias type (float) should be the same

It's the same dataset that you use in your article.
I have added a new config file as follows:

 {
    "name": "ppmi64",
    "phase": "train", // always set to train in the config
    "gpu_ids": [
        0
    ],
    "path": { //set the path
        "log": "logs",
        "tb_logger": "tb_logger",
        "results": "results",
        "checkpoint": "checkpoint",
        "resume_state": null // UPDATE THIS FOR RESUMING TRAINING
    },
    "datasets": {
        "train": {
            "name": "ppmi",
            "dataroot": "/PPMI/noisy.nii.gz",
            "valid_mask": [10,64],
            "phase": "train",
            "padding": 3,
            "val_volume_idx": 40, // the volume to visualize for validation
            "val_slice_idx": 40, // the slice to visualize for validation
            "batch_size": 32,
            "in_channel": 1,
            "num_workers": 0,
            "use_shuffle": true
        },
        "val": {
            "name": "ppmi",
            "dataroot": "/PPMI/noisy.nii.gz",
            "valid_mask": [10,64],
            "phase": "val",
            "padding": 3,
            "val_volume_idx": 40, // the volume to visualize for validation
            "val_slice_idx": 40, // the slice to visualize for validation
            "batch_size": 1,
            "in_channel": 1,
            "num_workers": 0
        }
    },
    "model": {
        "which_model_G": "mri",
        "finetune_norm": false,
        "drop_rate": 0.0,
        "unet": {
            "in_channel": 1,
            "out_channel": 1,
            "inner_channel": 32,
            "norm_groups": 32,
            "channel_multiplier": [
                1,
                2,
                4,
                8,
                8
            ],
            "attn_res": [
                16
            ],
            "res_blocks": 2,
            "dropout": 0.0,
            "version": "v1"
        },
        "beta_schedule": { // use manual beta_schedule for acceleration
            "train": {
                "schedule": "rev_warmup70",
                "n_timestep": 1000,
                "linear_start": 5e-5,
                "linear_end": 1e-2
            },
            "val": {
                "schedule": "rev_warmup70",
                "n_timestep": 1000,
                "linear_start": 5e-5,
                "linear_end": 1e-2
            }
        },
        "diffusion": {
            "image_size": 128,
            "channels": 3, //sample channel
            "conditional": true // not used for DDM2
        }
    },
    "train": {
        "n_iter": 100000, //150000,
        "val_freq": 1e3,
        "save_checkpoint_freq": 1e4,
        "print_freq": 1e2,
        "optimizer": {
            "type": "adam",
            "lr": 1e-4
        },
        "ema_scheduler": { // not used now
            "step_start_ema": 5000,
            "update_ema_every": 1,
            "ema_decay": 0.9999
        }
    },
    // for Phase1
    "noise_model": {
        "resume_state": null,
        "drop_rate": 0.0,
        "unet": {
            "in_channel": 2,
            "out_channel": 1,
            "inner_channel": 32,
            "norm_groups": 32,
            "channel_multiplier": [
                1,
                2,
                4,
                8,
                8
            ],
            "attn_res": [
                16
            ],
            "res_blocks": 2,
            "dropout": 0.0,
            "version": "v1"
        },
        "beta_schedule": { // use manual beta_schedule for acceleration
            "linear_start": 5e-5,
            "linear_end": 1e-2
        },
        "n_iter": 10000,
        "val_freq": 2e3,
        "save_checkpoint_freq": 1e4,
        "print_freq": 1e3,
        "optimizer": {
            "type": "adam",
            "lr": 1e-4
        }
    },
    "stage2_file": "" // **UPDATE THIS TO THE PATH OF PHASE2 MATCHED FILE** 
}

Do we need to change anything else in the config file?
Can you share the configuration files you have used for the other datasets?
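
A note for anyone sanity-checking a config like the one above: the file contains `//` comments, which Python's standard `json` module rejects. The sketch below shows one way to strip them before parsing. It is only an illustration under stated assumptions: the helper name `load_commented_json` is hypothetical and not taken from the DDM2 code, and it assumes no string value in the config contains `//`.

```python
import json
from collections import OrderedDict


def load_commented_json(text):
    """Parse JSON text that may carry trailing '//' comments.

    Hypothetical helper for illustration only. It cuts each line at the
    first '//' and therefore assumes no string value contains '//'
    (a URL, for example, would be truncated).
    """
    stripped = "\n".join(line.split("//")[0] for line in text.splitlines())
    return json.loads(stripped, object_pairs_hook=OrderedDict)


# A shortened excerpt of the config from this issue.
sample = """
{
    "name": "ppmi64",
    "phase": "train", // always set to train in the config
    "datasets": {
        "train": {
            "dataroot": "/PPMI/noisy.nii.gz",
            "valid_mask": [10,64],
            "batch_size": 32
        }
    },
    "train": {
        "n_iter": 100000, //150000,
        "val_freq": 1e3
    }
}
"""

cfg = load_commented_json(sample)
print(cfg["name"], cfg["datasets"]["train"]["valid_mask"])
# Values written as 1e3 are parsed as floats, not ints.
print(type(cfg["train"]["n_iter"]).__name__, type(cfg["train"]["val_freq"]).__name__)
```

Parsing the file this way before a long run is a cheap check that no comma or brace was lost while editing.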

Hi, sorry for the bug. It is caused by a data type mismatch: the MRI tensor is double precision while the convolution parameters are single-precision floats. I have just updated the code, so please pull the latest version and try again :)
Your configuration for PPMI looks good to me and should work for Stage1. Remember to update the configuration after each stage, as described in the README.
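
For readers who hit the same error elsewhere, here is a minimal sketch of the mismatch and one common workaround. It is not the actual fix committed to the repository. The random array is a hypothetical stand-in for slices loaded from a NIfTI file, and the layer sizes are arbitrary.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical stand-in for a batch of MRI slices. NumPy creates
# float64 arrays by default, and torch.from_numpy keeps that dtype.
batch = np.random.rand(2, 1, 16, 16)
x64 = torch.from_numpy(batch)

# nn.Conv2d parameters are float32 by default.
conv = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3, padding=1)

try:
    conv(x64)  # double input against float parameters
except RuntimeError as err:
    print("dtype mismatch:", err)

# Workaround: cast the input to float32 before the forward pass.
x32 = x64.float()
out = conv(x32)
print(out.dtype, tuple(out.shape))
```

Casting the input is usually preferable to calling `.double()` on the model, since float64 convolutions use more memory and run slower on most GPUs.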

It's working now!
Thanks!