SeungjunNah/DeepDeblur-PyTorch

How to run inference on just the test data provided by you using the trained model

pranaysingh25 opened this issue · 14 comments

Hi,

Amazing contribution, I must say; thanks for that.
I got curious, so I dug into the code and implementation. I wanted to produce evaluation results on the test set of the GOPRO_Large dataset you provide.

What I have done: I downloaded, unzipped, and placed the trained models and the dataset into their respective folders, as suggested in the README and the comments across the code.

Now, what I am looking for is the exact command that picks up the trained model and reproduces the results on the test data (those 1111 images out of the 3214 total). I don't want to retrain the whole model for that, and I don't want to use the demo functionality either; I want to run evaluation on the test data standalone. However, I was a little confused while reading the code base about what I need to change or omit, since there is no clear example command for this in the README.

Can you please guide me through this? I am just looking for the command and/or any change I need to make in the code to run evaluation on the test data.

Hi @pranaysingh25,

  1. You can use the do_train and do_test arguments, given that the data is located as described in the README.
python main.py --save_dir GOPRO_L1 --do_train false --do_test true --save_results all --dataset GOPRO_Large

The above command will load the model from the last epoch and run testing without additional training.

  2. You can use the example demo commands in the README Demo section.

For example, if your downloaded model is located in DeepDeblur-Pytorch/experiment/GOPRO_L1, you can run

# single GPU (GOPRO_Large, single precision)
python main.py --save_dir GOPRO_L1 --demo true --demo_input_dir ~/Research/dataset/GOPRO_Large/test/GOPR0384_11_00/blur_gamma

Hi @SeungjunNah,

Thanks for the prompt reply. You gave two commands; the first one is the one I had actually tried before reaching out to you. It ran in a few seconds and printed that the results were saved, but when I checked that folder there were no results.
Here is my command:
python main.py --save_dir GOPRO_L1 --do_train false --do_test true --save_result all --dataset GOPRO_Large --data_root /content/drive/MyDrive/Ampviv/deblur/nah/DeepDeblur-PyTorch-master/dataset/

and here is the output, in the second cell:
[screenshot: console output]

As you can see, it says "results are saved in ../experiment/GOPRO_L1/result".
There are two problems:

  1. This command ran successfully within 3-4 seconds; I doubt it did anything other than loading the model and the dataset.
  2. The ../experiment/GOPRO_L1/result directory is completely empty, as you can see below (the files inside would open in a tree structure).
    [screenshot: empty result directory]

So basically, is there some glitch, or are we missing some flag? I also checked the source code; as far as I can tell, it just calls the fill_evaluation() function for the test data (I may be wrong), but this is not working.

The second command you gave, the one that uses the demo function, works perfectly.
This is how I used it:
!python main.py --save_dir GOPRO_L1 --demo true --demo_input_dir /content/drive/MyDrive/Ampviv/deblur/nah/DeepDeblur-PyTorch-master/dataset/GOPRO_Large/test/GOPR0384_11_00/blur_gamma

and this is the output:
[screenshot: demo run console output]

This is indeed wonderful: it takes time to run and I can see output in the result directory, as shown in the picture above. But it doesn't solve my purpose; I need SSIM and PSNR results, and it only gives me the deblurred output images.
I have two options now: either use this demo command to somehow get the metrics as well (maybe by adding an additional flag?), where I would just put all of the test data into one folder as the demo input directory, or get the first command you gave to run successfully.
Can you check it out? I think there's something missing in the first command; I am so close. The --data_root is accurate, IMO. If you could look into it, that would help me a lot.

Thanks,
Pranay

Hi @SeungjunNah ,

Like I said, the first command you gave, i.e. python main.py --save_dir GOPRO_L1 --do_train false --do_test true --save_result all --dataset GOPRO_Large --data_root /content/drive/MyDrive/Ampviv/deblur/nah/DeepDeblur-PyTorch-master/dataset/, calls the fill_evaluation() function from train.py.
But it doesn't call the evaluate() function in that file. See the image below:

[screenshot: train.py fill_evaluation() with added print statements]

As you can see from the print statements I added to trace where the code goes, it never enters the if do_eval: block, and hence self.evaluate(epoch, self.mode) never gets called. That means the loss_metric and metric_missing boolean values come out as False.

Oh, I'm sorry for missing the additional arguments you need.
Could you try running the following command, by specifying --load_epoch, --start_epoch, and --end_epoch?

python main.py --save_dir GOPRO_L1 --do_train false --do_test true --save_result all --dataset GOPRO_Large --load_epoch 1000 --start_epoch 1000 --end_epoch 1000

As you noticed, the fill_evaluation function is there to avoid duplicate training and evaluation when loss/metric records are already found.
You can bypass the condition with the above command.

Hi @SeungjunNah ,

That's perfect! Now it works like a charm. I can see the output and the metrics too.
Thanks for sticking with me. That helped a lot.

Hey @SeungjunNah

There's one last thing,
During evaluation (using the command you gave), there is a nice real-time progress readout on the console that prints the loss, PSNR, and SSIM values, keeps updating them as it progresses, and gives the averages at the end of the run.

I want to dump this information to a file, so I'd like to know where these values are calculated in the code, per image. I just want to go there and add a block of code that picks up the per-image SSIM and PSNR values and dumps them into a CSV file row by row.
Could you tell me where I should put my code, i.e., where the per-image PSNR and SSIM are returned?

Hi @pranaysingh25,

In train.py, line 158, you will see a call to self.criterion(output, target).
The criterion forward function is defined in loss/__init__.py, line 152.

You will see that it computes

  1. the loss,
  2. and the metrics, by calling self.measure at line 202.

You can access the measured PSNR and SSIM at line 232.
The variable metric_type is either PSNR or SSIM, and _metric is the calculated value.
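
If you hook in there, a minimal sketch of the kind of block you could drop in is below. The variables metric_type and _metric come from the code described above, while the CSV path and the float() conversion are assumptions about your setup, not the repository's actual code (no image name is written here, since that depends on what is in scope at that point):

# Sketch only (not the repo's code): append one row per measured value.
import csv

with open('metrics_per_image.csv', 'a', newline='') as f:   # hypothetical output path
    writer = csv.writer(f)
    writer.writerow([metric_type, float(_metric)])          # float() in case _metric is a 0-d tensor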

Modifying the existing code could be complex as the loss and metrics are calculated in different places.
Alternatively, you can evaluate PSNR and SSIM from the saved results by calling the metric functions defined in loss/metric.py.
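
If you go that post-hoc route, a rough sketch of such a script follows. It does not quote the exact signatures in loss/metric.py; instead it uses scikit-image's metrics as a stand-in, and the directory paths and file-name matching are placeholders you would adapt to your saved results:

# Rough sketch: compute PSNR/SSIM over saved results and dump them to a CSV.
# Paths are placeholders; scikit-image is used here as a stand-in for loss/metric.py.
import csv
import os
import imageio.v2 as imageio
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

result_dir = '../experiment/GOPRO_L1/result/some_sequence'   # deblurred outputs (placeholder)
sharp_dir = '/path/to/GOPRO_Large/test/some_sequence/sharp'  # ground truth (placeholder)

with open('metrics_per_image.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['image', 'PSNR', 'SSIM'])
    for name in sorted(os.listdir(result_dir)):
        output = imageio.imread(os.path.join(result_dir, name))
        target = imageio.imread(os.path.join(sharp_dir, name))  # adjust if file names differ
        # data_range=255 assumes 8-bit images; scikit-image < 0.19 uses multichannel=True
        psnr = peak_signal_noise_ratio(target, output, data_range=255)
        ssim = structural_similarity(target, output, channel_axis=-1, data_range=255)
        writer.writerow([name, psnr, ssim])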

Hi @SeungjunNah

As you said, evaluating PSNR and SSIM from the saved results by calling the metric functions in loss/metric.py was my hunch as well; thanks for validating it. I think I am going to go this way, since it sounds easy.

Thanks again.

Hi @SeungjunNah ,

I am done with this milestone of testing the pretrained model on my sample datasets.

Moving forward, I want to train the same model architecture, but on my own sample training data.
Is that possible?

  1. If I replace the images in the GOPRO training set with my training data and run the training with the --dataset GOPRO_Large argument, would that be one way?
  2. Or is there already a command to do this without the above workaround, i.e., just pass the path of the new training data and it will train the model?

Or anything else? I wanted to know whether this is possible just by executing a command and providing new data, or whether it also requires changes in the code.
If it's simple, please let me know; I am quite curious.

Thanks,
Pranay

Hi @pranaysingh25,

You can write a new dataset class and pass its name with the --dataset argument.
In gopro_large.py, the GOPRO_Large class inherits from the Dataset class defined in dataset.py.

You may want to check data/__init__.py to see how the dataset class is used.
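
As a rough illustration only (copy gopro_large.py and follow it for the exact constructor signature, import paths, and any methods to override; the names below are placeholders), a new dataset class could look like this:

# data/my_blur_dataset.py -- illustrative skeleton; mirror gopro_large.py for details.
from data.dataset import Dataset   # import path guessed; match the import used in gopro_large.py

class MyBlurDataset(Dataset):
    """Placeholder dataset; expects a blur/sharp folder layout like GOPRO_Large."""
    def __init__(self, args, mode='train'):
        # forward the parsed arguments and the split mode to the base class,
        # the same way GOPRO_Large does
        super(MyBlurDataset, self).__init__(args, mode)

You would then train with --dataset MyBlurDataset and point --data_root at your data, assuming data/__init__.py resolves the dataset by that class name.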

Hi @SeungjunNah ,

Thanks, I was able to do it. As one of our experiments, we want to use transfer learning to train a model on another dataset (or even on your GOPRO dataset). We want to use the model and the data you provide, but the pretrained model should serve as an initialization point for the weights (and in the future I may freeze the initial modules and let only the later modules train).
I tried this command:
!python main.py --save_dir GOPRO_L1 --do_train true --do_test true --save_result all --dataset GOPRO_Large --load_epoch 1000 --start_epoch 1000 --end_epoch 1002 --data_root /content/DeepDeblur-PyTorch-master/Dataset/GOPRO_Large

I set the end epoch to 1002 just to check whether it's working. I am not sure whether this command is right or whether I need to make some changes in the code, but the above command gives me a CUDA out of memory error on the GPU.

You can use the --pretrained argument to specify the pretrained model location.

Hi @SeungjunNah ,

Thanks for your prompt response,

I tried the command with --pretrained as you suggested. Allow me to briefly reiterate what I am trying to do with it:

  1. I want to load your pretrained model and use its weights as an initialization point to further train the model on the GOPRO dataset for a few epochs.
  2. Once I am able to load your model with the saved weights for training, I am going to freeze its initial modules using something like:

for name, param in self.model.named_parameters():
    if name in [list of modules I want to freeze]:
        param.requires_grad = False

I tried your suggestion to do (1), i.e., to load the pretrained model. The command is:
!python main.py --pretrained /content/DeepDeblur-PyTorch-master/experiment/GOPRO_L1/models/model-1000.pt --save_dir GOPRO_L1 --do_train true --do_test true --dataset GOPRO_Large --load_epoch 1000 --start_epoch 1000 --end_epoch 1010 --data_root /content/DeepDeblur-PyTorch-master/Dataset/

But this says:
starting from epoch 1000! ignoring pretrained model path..
Why would it ignore the pretrained model path? Am I using the right command?

Here's a look into that:
[screenshot: console output showing the "ignoring pretrained model path" message]

I assume that you want to start a new training schedule from the pretrained model.
Then the starting epoch should be 1, not 1001.

You may want to do something like:
python main.py --save_dir NEW_EXPERIMENT_NAME --pretrained PRETRAINED_MODEL_PATH.pt --dataset GOPRO_Large

Also, I would rather call the nn.Module.requires_grad_() function than manually assign the flag to each parameter.
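
For example, something along these lines; the submodule name below is made up, so print the model or iterate named_children() to find the blocks you actually want to freeze:

# Freeze selected submodules in place; 'head' is a placeholder name.
for name, module in self.model.named_children():
    if name in ['head']:              # submodules you want to freeze
        module.requires_grad_(False)  # recursively disables grads for all parameters inside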