Trying to reproduce your results
uyekt opened this issue · 11 comments
Hi there,
Thank you very much for releasing this code!
I'm trying to reproduce your results. However, I guess I'm missing something...
Training config is:
{
"batch_size": 64,
"ckpt_dir": "checkpoint/carn_baseline",
"ckpt_name": "carn_baseline",
"clip": 10.0,
"decay": 400000,
"group": 1,
"loss_fn": "L1",
"lr": 0.0001,
"max_steps": 600000,
"model": "carn",
"num_gpu": 1,
"patch_size": 64,
"print_interval": 1000,
"sample_dir": "sample/",
"scale": 0,
"shave": 20,
"train_data_path": "dataset/DIV2K_train.h5",
"verbose": true
}
# of params: 1591963
On DIV2K bicubic. Did you use bicubic downscaling or the unknown downgrading operators?
After 575k iterations on a single Titan X, I could only achieve the following results on Urban100:
- x2: 30.31
- x3: 26.52
- x4: 24.57
which is kind of far from the paper's results :-(
Is it just bad luck with the initialization, or am I missing something important?
Btw, I noticed that I can fit batch 64 / patch 64 on a single Titan X. When I use two, the second one only loads about 600 MB of memory. Is that normal behavior?
Thanks a lot for your help!
Hi.
Did you run the benchmark in Python? Normally, it has to be calculated in MATLAB.
First, I guess in this code evaluation is performed using RGB channels, but it's common in the SR community to calculate PSNR using only the Y (luminance) channel.
Second, I'm not sure why, but converting RGB to the Y channel gives different results in Python and MATLAB.
So, if your scores are calculated during training, please treat them as just "validation" scores.
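(For reference, a minimal sketch of how Y-channel PSNR is typically computed the MATLAB way, i.e. with the BT.601 rgb2ycbcr convention where Y lands in [16, 235]; this is only an illustration, not the exact evaluation code behind the paper numbers:)
import numpy as np

def rgb_to_y_bt601(img):
    # img: H x W x 3 uint8 RGB array; returns the Y channel as float64 in [16, 235],
    # matching MATLAB's rgb2ycbcr convention (ITU-R BT.601)
    rgb = img.astype(np.float64) / 255.0
    return 16.0 + 65.481 * rgb[..., 0] + 128.553 * rgb[..., 1] + 24.966 * rgb[..., 2]

def psnr_y(sr, hr):
    # PSNR between the Y channels of two same-sized uint8 RGB images, peak value 255
    mse = np.mean((rgb_to_y_bt601(sr) - rgb_to_y_bt601(hr)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)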
Oh thanks, I missed the comment in the training code.
I was able to almost reproduce your results by converting RGB to Y.
Do you crop any border pixels on Y before calculating PSNR/SSIM?
I refer to the eval code in the SelfExSR repo, which crops scale_factor border pixels.
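(For reference, a minimal sketch of that shaving step on a 2-D Y-channel array:)
def shave(y, scale):
    # drop `scale` border pixels on each side, as the SelfExSR evaluation does,
    # before computing PSNR/SSIM on the Y channel
    return y[scale:-scale, scale:-scale]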
Perfect! Thanks a lot!
Hey @nmhkahn,
Thanks for sharing the code.
I'm trying to reproduce your results, so I'd like to know how many steps you trained the network for to achieve them, and after how many steps you started the decay.
Thanks in advance!
@Auth0rM0rgan
We trained for 600k steps and decayed the learning rate every 400k steps, just like the example command in the README.
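(Roughly, a step-decay schedule like that can be sketched as follows; the halving factor below is an assumption for illustration, not taken from the repo:)
def learning_rate(base_lr, step, decay):
    # with base_lr=1e-4 and decay=400000, the LR stays at 1e-4 until step 400k
    # and is reduced afterwards (the 0.5 factor here is an assumption)
    return base_lr * (0.5 ** (step // decay))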
Hey @nmhkahn,
Thanks for the reply. Just one question about the CARN checkpoint you provided. Is it the last checkpoint of your model after 600K or the best one?
Thanks!
We trained a few models for the full 600K steps and picked the best one, but you can pick the best step if you train a single model.
@nmhkahn hi,
I just got into super-resolution, and I don't know how to train or test with RGB converted to Y.
Do I need to modify this code to use Y channels for training and testing?
Or should I use the RGB checkpoint, convert its outputs to the Y channel, and test in MATLAB?
I have never tested in MATLAB, only TensorFlow or PyTorch. How should I do it?
Looking forward to your reply. Thanks.
@zymize Hi.
First of all, my code works on RGB channels and the network produces RGB images. If you want to benchmark on [Set5, Set14, B100, Urban100]: 1) get the RGB images using CARN, 2) convert them to Y-channel (gray-scale) images, 3) test with MATLAB.
Many MATLAB versions of the evaluation code perform steps 2 and 3 simultaneously, such as this one.
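(For reference, a minimal sketch of step 2, converting one CARN RGB output into a gray-scale Y image for the MATLAB script; the file names and paths below are hypothetical:)
import numpy as np
from PIL import Image

# hypothetical paths: one CARN RGB output -> one gray-scale Y image for the MATLAB script
sr_rgb = np.array(Image.open('results/Urban100/x4/img_001_SR.png').convert('RGB'))
rgb = sr_rgb.astype(np.float64) / 255.0
y = 16.0 + 65.481 * rgb[..., 0] + 128.553 * rgb[..., 1] + 24.966 * rgb[..., 2]  # BT.601, Y in [16, 235]
Image.fromarray(np.clip(np.rint(y), 0, 255).astype(np.uint8)).save('results_y/img_001_SR.png')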
I'm also trying to reproduce the results. I noticed a difference between the MATLAB and Python PSNR implementations.
Using the MATLAB code:
pkg load image
% compute_difference: PSNR evaluation function, presumably from the SelfExSR eval code mentioned above
im1 = imread('/datasets/Set5/image_SRF_4/img_001_SRF_4_bicubic.png');
im2 = imread('/datasets/Set5/image_SRF_4/img_001_SRF_4_HR.png');
d = compute_difference(im1, im2, 4)
returns d = 31.771
Using Python:
from torchvision.io import read_image
from torchvision.transforms import Compose, CenterCrop
from torchmetrics import PeakSignalNoiseRatio
from torch.nn.functional import mse_loss as mse
from piq import psnr
import torch
from color import rgb_to_ycbcr  # RGB -> YCbCr conversion helper used below
im1 = read_image('/datasets/Set5/image_SRF_4/img_001_SRF_4_bicubic.png')
im2 = read_image('/datasets/Set5/image_SRF_4/img_001_SRF_4_HR.png')
# shave scale=4 pixels from each side (8 in total per dimension)
border = 4 * 2
border_removal = Compose([
    CenterCrop((int(im1.shape[1] - border), int(im1.shape[2] - border))),
])
im1 = border_removal(im1)
im2 = border_removal(im2)
# using the piq module
p = psnr(im1.float().unsqueeze(0), im2.float().unsqueeze(0), data_range=255., convert_to_greyscale=True, reduction='mean')
print(p)
# using the torchmetrics module, on the Y channel only
psnr2 = PeakSignalNoiseRatio()
p = psnr2(rgb_to_ycbcr(im1.float())[0, :, :], rgb_to_ycbcr(im2.float())[0, :, :])
print(p)
# "manually" computing PSNR on the Y channel
max_val = 255.0
print(10.0 * torch.log10(max_val ** 2 / mse(rgb_to_ycbcr(im1.float())[0, :, :], rgb_to_ycbcr(im2.float())[0, :, :], reduction='mean')))
gives:
tensor(30.5169)
tensor(30.5169)
tensor(30.5169)
Not sure where the difference comes from... maybe the RGB to YCbCr conversion is different?
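One thing that may account for part of the gap (not verified here) is the Y convention itself: MATLAB's rgb2ycbcr maps Y into the limited [16, 235] range, while full-range luma uses [0, 255], and that scaling alone shifts the PSNR. A quick check on the same image pair (border shaving omitted for brevity):
import torch
from torchvision.io import read_image

im1 = read_image('/datasets/Set5/image_SRF_4/img_001_SRF_4_bicubic.png').float()
im2 = read_image('/datasets/Set5/image_SRF_4/img_001_SRF_4_HR.png').float()

def y_full(img):
    # full-range luma in [0, 255]
    return 0.299 * img[0] + 0.587 * img[1] + 0.114 * img[2]

def y_limited(img):
    # MATLAB rgb2ycbcr-style Y, mapped to [16, 235]
    return 16.0 + (219.0 / 255.0) * y_full(img)

def psnr_255(a, b):
    return 10.0 * torch.log10(255.0 ** 2 / torch.mean((a - b) ** 2))

print(psnr_255(y_full(im1), y_full(im2)))        # full-range Y
print(psnr_255(y_limited(im1), y_limited(im2)))  # limited-range Y: higher by 20*log10(255/219) ~= 1.3 dB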