RuntimeError: size mismatch, m1, m2
Hi, I have encountered an error while training. I am trying to train the model on the DIV2K dataset (`DIV2K_train_HR` and `DIV2K_train_LR_bicubic/X4`). After running `python create_dataset.py`, I successfully generated the data and, following `LRHR_dataset.py`, put it in the right place. When I start training, it downloads the pretrained model and then fails with this error:
```
LogHandlers setup!
21-06-15 20:41:57.700 : ===================== Selected training parameters =====================
21-06-15 20:41:57.701 : Namespace(D_init_iters=0, D_update_ratio=1, alpha=1.2, amsgrad=False, beta1_D=0.9, beta1_G=0.9, beta2_D=0.999, beta2_G=0.999, cuda=True, eps_D=1e-08, eps_G=1e-08, feature_criterion='l1', feature_weight=1.0, gan_type='ragan', gan_weight=1.0, imdbTestPath='./datasets/', imdbTrainPath='./datasets/', in_nc=3, is_mixup=True, is_train=True, lr_D=0.0001, lr_G=0.0001, lr_gamma=0.5, lr_milestones=[5000, 10000, 20000, 30000], lr_restart=None, lr_restart_weights=None, nf=64, niter=51000, numWorkers=4, patch_size=40, pixel_criterion='l1', pixel_weight=10.0, pretrain=True, pretrainedModelPath='pretrained_nets/SRResDNet/G_perceptual.pth', resdnet_depth=5, resume=True, resume_start_epoch=0, rgb_range=255, saveBest=True, saveImgsPath='results', saveLogsPath='logs', saveTrainedModelsPath='trained_nets', save_checkpoint_freq=20, save_path_best_lpips='/best_lpips/', save_path_best_psnr='/best_psnr/', save_path_netD='/netD/', save_path_netG='/netG/', save_path_training_states='/training_states/', seed=123, testBatchSize=1, test_stdn=[0.0], trainBatchSize=16, train_stdn=[0.0], tv_criterion='l1', tv_weight=1.0, upscale_factor=4, use_bn=False, use_chop=False, use_filters=True, warmup_iter=-1, weightdecay_D=0, weightdecay_G=0).
21-06-15 20:41:57.701 : ===================== Loading dataset =====================
21-06-15 20:41:57.706 : training dataset: 2400
21-06-15 20:41:57.706 : training loaders: 150
21-06-15 20:41:57.707 : testing dataset: 100
21-06-15 20:41:57.707 : testing loaders: 100
21-06-15 20:41:57.707 : ===================== Building model =====================
21-06-15 20:41:57.803 : Initialized model with pretrained net from pretrained_nets/SRResDNet/G_perceptual.pth.
Setting up Perceptual loss...
Loading model from: /home/xuwh/RJPcode/SRResCGAN-master/training_codes/modules/weights/v0.1/alex.pth
...[net-lin [alex]] initialized
...Done
21-06-15 20:42:01.452 : Network G structure: SRResDNet, with parameters: 380,356
21-06-15 20:42:01.452 : SRResDNet(
(model): ResDNet(
(conv1): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
(layer1): Sequential(
(0): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(relu1): PReLU(num_parameters=64)
(relu2): PReLU(num_parameters=64)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
)
(1): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(relu1): PReLU(num_parameters=64)
(relu2): PReLU(num_parameters=64)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
)
(2): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(relu1): PReLU(num_parameters=64)
(relu2): PReLU(num_parameters=64)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
)
(3): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(relu1): PReLU(num_parameters=64)
(relu2): PReLU(num_parameters=64)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
)
(4): BasicBlock(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(relu1): PReLU(num_parameters=64)
(relu2): PReLU(num_parameters=64)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
)
)
(conv_out): ConvTranspose2d(64, 3, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(l2proj): L2Proj()
)
(noise_estimator): Wmad_estimator()
(bbproj): Hardtanh(min_val=0.0, max_val=255.0)
)
21-06-15 20:42:01.453 : Network D structure: Discriminator_VGG_128, with parameters: 14,499,401
21-06-15 20:42:01.453 : Discriminator_VGG_128(
(conv0_0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(conv0_1): Conv2d(64, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(bn0_1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv1_0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1_0): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv1_1): Conv2d(128, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(bn1_1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2_0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2_0): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2_1): Conv2d(256, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(bn2_1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3_0): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn3_0): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3_1): Conv2d(512, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(bn3_1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv4_0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn4_0): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv4_1): Conv2d(512, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(bn4_1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(linear1): Linear(in_features=8192, out_features=100, bias=True)
(linear2): Linear(in_features=100, out_features=1, bias=True)
(lrelu): LeakyReLU(negative_slope=0.2, inplace=True)
)
21-06-15 20:42:01.453 : Network F structure: VGGFeatureExtractor, with parameters: 20,024,384
21-06-15 20:42:01.453 : VGGFeatureExtractor(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace=True)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace=True)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace=True)
(16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(17): ReLU(inplace=True)
(18): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(19): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace=True)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace=True)
(23): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(24): ReLU(inplace=True)
(25): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(26): ReLU(inplace=True)
(27): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace=True)
(30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(31): ReLU(inplace=True)
(32): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(33): ReLU(inplace=True)
(34): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
)
21-06-15 20:42:01.454 : ===================== start training =====================
21-06-15 20:42:01.454 : ===================== resume training =====================
21-06-15 20:42:01.454 : ===> No saved training states to resume.
21-06-15 20:42:01.454 : ===> start training from epoch: 0, iter: 0.
21-06-15 20:42:01.454 : Total # of epochs for training: 340.
21-06-15 20:42:01.454 : ===> train:: Epoch[1]
21-06-15 20:42:03.040 : ===> train:: Epoch[1] Iter-step[1]
Traceback (most recent call last):
  File "main_sr_color.py", line 1057, in <module>
    main()
  File "main_sr_color.py", line 964, in main
    current_step)
  File "main_sr_color.py", line 418, in train
    pred_g_fake = netD(filter_high(fake_H))
  File "/home/xuwh/anaconda3/envs/srrescgan/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xuwh/RJPcode/SRResCGAN-master/training_codes/models/discriminator_vgg_arch.py", line 57, in forward
    fea = self.lrelu(self.linear1(fea))
  File "/home/xuwh/anaconda3/envs/srrescgan/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xuwh/anaconda3/envs/srrescgan/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/xuwh/anaconda3/envs/srrescgan/lib/python3.6/site-packages/torch/nn/functional.py", line 1370, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch, m1: [16 x 12800], m2: [8192 x 100] at /opt/conda/conda-bld/pytorch_1579027003190/work/aten/src/THC/generic/THCTensorMathBlas.cu:290
```
I understand that for `m1: [a x b]` and `m2: [c x d]`, the multiplication requires `b = c`. But how does this error occur when I have not changed the source code at all? Is it a hyperparameter problem? I am new to deep learning, so I am really confused about why this happens and how to fix it.
Thanks in advance.
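For context, the failing multiplication can be reproduced in isolation using only the shapes printed in the error (a minimal sketch, not the repo's code):

```python
import torch

# The shapes from the traceback: a batch of 16 flattened feature vectors
# of length 12800 (m1), fed to linear1 with in_features=8192 (m2).
linear1 = torch.nn.Linear(8192, 100)
features = torch.randn(16, 12800)

try:
    linear1(features)
except RuntimeError as e:
    print(e)  # size mismatch, m1: [16 x 12800], m2: [8192 x 100]
```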
Hi, you set `patch_size=40`, but the discriminator network `netD` takes a `fake_H` input of size 128x128x3, while your SR output `fake_H` has size 160x160x3 (40 x 4 upscale). With a 160x160 input, the feature map after the discriminator's five stride-2 convolutions is 5x5x512 = 12800 features instead of the 4x4x512 = 8192 that `linear1` expects, hence the size mismatch.
Can you set `patch_size=32` (the default setting)? The issue should not occur then.
Thanks.
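To make the arithmetic concrete, here is a small sketch of how the LR patch size determines the flattened feature count reaching `linear1`, assuming the five stride-2 convolutions and 512 final channels shown in the `Discriminator_VGG_128` dump above:

```python
def disc_flatten_size(patch_size, upscale, downsamples=5, channels=512):
    """Flattened feature count reaching linear1 for a given LR patch size."""
    hr = patch_size * upscale        # SR output (fake_H) spatial size
    feat = hr // 2 ** downsamples    # after five stride-2 convolutions
    return feat * feat * channels

print(disc_flatten_size(32, 4))  # 4 * 4 * 512 = 8192  -> matches linear1
print(disc_flatten_size(40, 4))  # 5 * 5 * 512 = 12800 -> the m1 in the error
```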
Oh, thanks! It works when I set `patch_size=32`.
But it is strange that `main_sr_color.py` (around line 65) suggests you should set `patch_size=40` when `upscale=4`:

```python
parser.add_argument('--patch_size', type=int, default=32, help='patch size for training. [x2-->60,x3-->50,x4-->40]')
```

Anyway, it starts training now. I really appreciate your help, sir.
Another question, about the test image dataset. I notice in `LRHR_dataset.py`, lines 39-40:

```python
self.dataroot_hr = dataroot+'DF2K/valid/clean/'
self.dataroot_lr = dataroot+'DF2K/valid/corrupted/'
```

Does this mean I should put the HR data generated by `create_dataset.py` into `DF2K/valid/clean` and the produced noisy LR data into `DF2K/valid/corrupted`, or should I put the original non-noisy LR images of DIV2K (the dataset I use) into `DF2K/valid/clean` and the noisy LR data into `DF2K/valid/corrupted`?
Thanks a lot.
Thank you for pointing out the issues; I have updated the comments in `main_sr.py`.
For the `valid` dataset, you can put the HR data generated by the `create_dataset.py` script into `DF2K/valid/clean` and the produced noisy LR data into `DF2K/valid/corrupted`.
For the NTIRE challenge task, they provide a validation set where the clean images are the HR images and the corrupted images are the LR images. You can use this validation set to reproduce the challenge results.
Thanks.
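As a quick sanity check, something like the following hypothetical snippet (the `dataroot` value is an assumption based on `imdbTestPath='./datasets/'` in the log above; the helper is not part of the repo) can confirm the folders are in place before launching training:

```python
import os

dataroot = './datasets/'  # assumed to match imdbTestPath in the training log
for sub in ('DF2K/valid/clean/', 'DF2K/valid/corrupted/'):
    path = os.path.join(dataroot, sub)
    count = len(os.listdir(path)) if os.path.isdir(path) else 0
    print(f'{path}: {count} files')
```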
Thanks for responding! Reading the modified `main_sr.py`, I saw `[x2-->64,x3-->42,x4-->32]`.
Another question just popped into my head: if I want `upscale=5`, what should `patch_size` be? For instance, suppose I have a pair of images with a low resolution of 100x100 and a high resolution of 500x500. One option is to resize the 100x100 LR images to 125x125 and then train the model with `upscale=4`. The other option is to just set `upscale=5`, but does the model even support `upscale=5`? If it does, what is the appropriate `patch_size`, and how can I calculate the corresponding `patch_size` from the `upscale`? Is it just simple math (I see the patch size for x4 is half that for x2), or should I read it from the model structure?
Sorry for bothering you with such newbie questions, LOL.
Thanks in advance.
Hi, thank you for your questions.
Currently the patch size depends on the discriminator net settings, because it takes 128x128 patches, so the output of the SR generator net must be 128x128. If you change the discriminator's settings to accept an arbitrary patch size, the upscale factor can be chosen arbitrarily.
For the existing settings, with `upscale=5` the patch size would be about 26, i.e. 128 / upscale.
Thanks.