rfuruta/pixelRL

Regarding Testing Setup


Hi!
This is a more general doubt about the concept rather than the implementation itself. Could you please explain how inference works? From the paper, it seems that you already have to know the final denoised image (to calculate rewards and improve iteratively). So how does this work on a test set, where you don't have access to the actual ground truths?

Thank you for your question.
In the training phase, we train the network to choose appropriate actions for the given current states, because we have access to the ground truths. So, if the test images are similar enough to the training images (e.g., the test and training images come from the same dataset or the same domain), the network can choose appropriate actions on test images even though they are unseen.
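
To make this concrete, here is a minimal sketch of the training-time reward, assuming the reward is the per-pixel decrease in squared error with respect to the ground truth; the function and variable names below are illustrative, not the repository's actual code:

```python
import numpy as np

def per_pixel_reward(ground_truth, state_before, state_after):
    """Reward used only during training: how much the squared error
    to the ground truth decreased after applying the chosen actions."""
    err_before = (ground_truth - state_before) ** 2
    err_after = (ground_truth - state_after) ** 2
    return err_before - err_after  # positive where the actions improved the pixel
```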

Thanks for the quick response!
So essentially, the reward calculation on the test images is redundant, right? Because you can't expect to have the ground truths of the test images, and consequently you can't define the reward. In any case, the reward on test images is not useful, since you already have a trained policy and will follow that policy regardless of the reward you get on a test image.

Please just confirm whether my understanding is correct.
Thank you!

Yes, you are totally right. The rewards on test images are redundant and not useful.
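
For completeness, a minimal sketch of test-time inference under this understanding: the trained policy is applied iteratively for a fixed number of steps, and neither the ground truth nor the reward appears anywhere in the loop (`policy`, `apply_actions`, and `T_STEPS` are placeholder names, not identifiers from this repository):

```python
import numpy as np

T_STEPS = 5  # number of iterative refinement steps (placeholder value)

def denoise(noisy_image, policy, apply_actions):
    """Test-time inference: repeatedly pick per-pixel actions with the
    trained policy and apply them. No ground truth or reward is needed."""
    state = noisy_image.astype(np.float32).copy()
    for _ in range(T_STEPS):
        actions = policy(state)                # per-pixel action map from the trained network
        state = apply_actions(state, actions)  # e.g. Gaussian filter, pixel value +/- 1, etc.
    return state
```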

Thanks for the clarification!