Accelerating the Super-Resolution Convolutional Neural Network

Replicating the results of this paper: https://arxiv.org/pdf/1608.00367.pdf
Authors: Chao Dong, Chen Change Loy, and Xiaoou Tang
Institution: Department of Information Engineering, The Chinese University of Hong Kong

How To Run

1. With Docker
cd app
docker build -t fsrcnn .
docker run -p 8000:8000 fsrcnn
Paste this in your browser: http://localhost:8000/

2. Without Docker
cd app
python3 api/app.py

Requirements

Python==3.11.5
NumPy==1.26.3
PyTorch==2.1.2 (with CUDA)
Matplotlib
Pillow (PIL)
pathlib (standard library)
glob (standard library)
zipfile (standard library)

Results

Note: I haven't had time to train at a scale of 2 or 4 yet, as training takes all day, but it is coming soon.

Eval. Metric	Scale	Paper	Mine
PSNR	2	36.94	34.77
PSNR	3	33.16	32.05
PSNR	4	30.55	30.82

Original	Original Cropped	BICUBIC x3	FSRCNN x3

Original	BICUBIC x3	FSRCNN x3

Mean Squared Error vs Mean Absolute Error Comparison

Original	Original Cropped	BICUBIC x3	FSRCNN x3 MSE	FSRCNN x3 MAE

Model Architecture

Structure: Conv(5, d, 1) −> PReLU −> Conv(1, s, d) −> PReLU −> m×Conv(3, s, s) −> PReLU −> Conv(1, d, s) −> PReLU −> DeConv(9, 1, d)
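As a rough sanity check on the structure above, the weight count can be worked out by hand: a Conv(f, n, c) layer has f x f x n x c weights. The plain-Python sketch below does that, ignoring biases and PReLU parameters. The values d=56, s=12, m=4 are the defaults from the paper and are assumed here, since this README does not state which values the repo uses. The helper names are hypothetical, and the padding of 4 and output padding of scale - 1 in the deconvolution size formula are one common choice, not something confirmed by models.py.

```python
def conv_weights(kernel: int, out_channels: int, in_channels: int) -> int:
    """Weights in Conv(kernel, out_channels, in_channels), biases excluded."""
    return kernel * kernel * out_channels * in_channels


def fsrcnn_weights(d: int = 56, s: int = 12, m: int = 4) -> int:
    """Total weights for the structure above (d, s, m assumed from the paper)."""
    layers = [
        conv_weights(5, d, 1),            # feature extraction
        conv_weights(1, s, d),            # shrinking
        *([conv_weights(3, s, s)] * m),   # m non-linear mapping layers
        conv_weights(1, d, s),            # expanding
        conv_weights(9, 1, d),            # deconvolution
    ]
    return sum(layers)


def deconv_output_size(size: int, scale: int, kernel: int = 9, padding: int = 4) -> int:
    """Transposed-convolution output size with stride = scale.

    Assumes output_padding = scale - 1 (an illustrative choice).
    """
    output_padding = scale - 1
    return (size - 1) * scale - 2 * padding + kernel + output_padding


print(fsrcnn_weights())           # 12464
print(deconv_output_size(10, 3))  # 30
```

With these assumed settings, the final layer turns an H x W input into a (scale x H) x (scale x W) output, so upscaling happens only at the end of the network and every earlier layer runs at low resolution.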

Differences:
Instead of the L2 (MSE) loss used in the paper, I used L1 (MAE) loss. The article linked below explains the problem with MSE: "using MSE or a metric based on MSE is likely to result in training finding a deep learning based blur filter, as that is likely to have the lowest loss and the easiest solution to converge to minimising the loss. A loss function that minimises MSE encourages finding pixel averages of plausible solutions that are typically overly smoothed and although minimising the loss, the generated images will have poor perceptual quality from a perspective of appealing to a human viewer."

The same article says of L1 loss: "with L1 loss, the goal is the least absolute deviations (LAD) to minimise the sum of the absolute differences between the ground truth and the predicted/generated image. MAE reduces the average error, whereas MSE does not. Instead, MSE is very prone to being affected by outliers. For Image Enhancement, MAE will likely result in an image which appears to be a higher quality from a human viewer’s perspective."

https://towardsdatascience.com/deep-learning-image-enhancement-insights-on-loss-function-engineering-f57ccbb585d7
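A toy illustration of the "pixel averages" argument quoted above, in plain Python. When several values are equally plausible for one pixel, the single prediction that minimises MSE is their mean, while the one that minimises MAE is their median. The pixel values are made up for the example and are not taken from this project's data.

```python
from statistics import mean, median

# Hypothetical plausible intensities for one pixel: mostly dark, one bright.
plausible = [0.0, 0.0, 0.0, 1.0, 0.0]


def mse(pred, targets):
    """Mean squared error of one prediction against all plausible targets."""
    return mean((pred - t) ** 2 for t in targets)


def mae(pred, targets):
    """Mean absolute error of one prediction against all plausible targets."""
    return mean(abs(pred - t) for t in targets)


# Brute-force search over candidate predictions in [0, 1].
candidates = [i / 100 for i in range(101)]
best_mse = min(candidates, key=lambda p: mse(p, plausible))
best_mae = min(candidates, key=lambda p: mae(p, plausible))

print(best_mse, mean(plausible))    # 0.2 0.2
print(best_mae, median(plausible))  # 0.0 0.0
```

The MSE-optimal value (0.2) is a gray that none of the plausible targets actually has, which is the over-smoothing effect described in the quote. The MAE-optimal value matches the most common target.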

Web App Demo

File Overview

notebooks

02_sandbox.ipynb
- Jupyter notebook that contains everything in one place, from ingestion to predictions. I used it as a rough draft before restructuring the code into .py files.

utils

helpers.py
- Python file containing helper functions I either created or found to assist with this project.
datasets.py
- Python file containing the custom datasets needed to train this model. Includes separate Train and Evaluation datasets, since the two have different requirements.
models.py
- Python file that contains the model, consisting of layers for feature extraction, shrinking, non-linear mapping, expanding, and deconvolution. Uses PReLU instead of ReLU, as it is more stable and avoids the 'dead features' caused by zero gradients in ReLU.
train.py
- Python file that trains the model using the methods train_step, test_step, and train. Evaluates the model using Peak Signal-to-Noise Ratio (PSNR), measured in dB.
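
For reference, PSNR in dB is 10 x log10(MAX^2 / MSE), where MAX is the largest possible pixel value. A minimal plain-Python sketch follows. The function name and the assumption that pixels are scaled to [0, 1] are illustrative and are not taken from train.py.

```python
import math


def psnr(pred, target, max_val=1.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(pred) != len(target):
        raise ValueError("inputs must have the same length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_val ** 2 / mse)


# Every pixel off by 0.1 gives MSE = 0.01, i.e. about 20 dB.
print(round(psnr([0.5, 0.6], [0.4, 0.5]), 2))  # 20.0
```

Because the scale is logarithmic, halving the MSE raises PSNR by about 3 dB, so the gaps of one to two dB in the results table correspond to noticeable differences in error.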

NicoCeresa/FSRCNN-2016