Neural Enhance
Example #1 — Old Station: view comparison in 24-bit HD, original photo CC-BY-SA @siv-athens.
As seen on TV! What if you could increase the resolution of your photos using technology from CSI laboratories? Thanks to deep learning and #NeuralEnhance, it's now possible to train a neural network to zoom in to your images at 2x or even 4x. You'll get even better results by increasing the number of neurons or training with a dataset similar to your low resolution image.
The catch? The neural network is hallucinating details based on its training from example images. It's not reconstructing your photo exactly as it would have been if it was HD. That's only possible in Hollywood — but using deep learning as "Creative AI" works and it is just as cool! Here's how you can get started...
1. Examples & Usage
The main script is called enhance.py, which you can run with Python 3.4+ once it's set up as below. The --device argument lets you specify which GPU or CPU to use. For the samples above, here are the performance results:
- GPU Rendering HQ — Assuming you have CUDA set up and enough on-board RAM to fit the image and neural network, generating 1080p output should complete in 5 seconds, or 2s per image when several images are processed in the same run.
- CPU Rendering HQ — This will take roughly 20 to 60 seconds for 1080p output, however on most machines you can run 4-8 processes simultaneously given enough system RAM. Runtime depends on the neural network size.
The default is --device=cpu. If you already have an NVIDIA card set up with CUDA, try --device=gpu0. On the CPU, you can also set the environment variable OMP_NUM_THREADS=4, which is most useful when running the script multiple times in parallel.
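The parallel CPU setup described above can be sketched in Python. This is a minimal illustration, not part of the project: the helper names `build_jobs` and `run_jobs` are hypothetical, and it assumes enhance.py sits in the current directory and accepts the `--zoom` and `--device` arguments shown in this document. Each process gets its own `OMP_NUM_THREADS` value so the workers don't compete for more threads than intended.

```python
import os
import subprocess
import sys


def build_jobs(files, workers=4, threads=4, zoom=2):
    """Split `files` into up to `workers` groups, one enhance.py command each.

    Returns a list of (command, environment) pairs; nothing is executed here.
    """
    groups = [files[i::workers] for i in range(workers)]
    jobs = []
    for group in groups:
        if not group:
            continue  # fewer files than workers
        # Copy the current environment and cap the OpenMP thread count.
        env = dict(os.environ, OMP_NUM_THREADS=str(threads))
        cmd = [sys.executable, "enhance.py", "--device=cpu",
               "--zoom={}".format(zoom)] + list(group)
        jobs.append((cmd, env))
    return jobs


def run_jobs(jobs):
    """Start every job at once, then wait for all of them; returns exit codes."""
    procs = [subprocess.Popen(cmd, env=env) for cmd, env in jobs]
    return [p.wait() for p in procs]
```

Calling `run_jobs(build_jobs(["a.jpg", "b.jpg", "c.jpg"]))` would start one process per non-empty group. How many workers fit depends on system RAM and the neural network size, so the defaults here are only a starting point.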
1.a) Enhancing Images
# Run the super-resolution script for one image, factor 1:1.
python3 enhance.py --zoom=1 example.png
# Also process multiple files with a single run, factor 2:1.
python3 enhance.py --zoom=2 file1.jpg file2.jpg
# Display output images that were given `_ne?x.png` suffix.
open *_ne?x.png
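To collect the results from a script rather than from the shell, the same `*_ne?x.png` wildcard can be matched in Python. A minimal sketch: `find_outputs` is a hypothetical helper, and it relies only on the suffix pattern quoted above.

```python
import fnmatch
import os


def find_outputs(folder="."):
    """Return sorted file names in `folder` that match the `*_ne?x.png` suffix."""
    return sorted(fnmatch.filter(os.listdir(folder), "*_ne?x.png"))
```

As in the shell, `?` matches exactly one character, so inputs and unrelated PNG files in the same folder are left out.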
1.b) Training Super-Resolution
Pre-trained models are provided in the GitHub releases. Training your own is a delicate process that may require you to pick parameters based on your image dataset.
# Remove the model file, as we don't want to reload the data to fine-tune it.
rm -f ne4x*.pkl.bz2
# Pre-train the model using perceptual loss from paper [1] below.
python3.4 enhance.py --train "data/*.jpg" --model custom --scales=2 --epochs=50 \
--perceptual-layer=conv2_2 --smoothness-weight=1e7 --adversary-weight=0.0 \
--generator-blocks=4 --generator-filters=64
# Train the model using an adversarial setup based on [4] below.
python3.4 enhance.py --train "data/*.jpg" --model custom --scales=2 --epochs=250 \
--perceptual-layer=conv5_2 --smoothness-weight=2e4 --adversary-weight=1e3 \
--generator-start=5 --discriminator-start=0 --adversarial-start=5 \
--discriminator-size=64
# The newly trained model is output into this file...
ls ne4x-custom-*.pkl.bz2
Example #2 — Bank Lobby: view comparison in 24-bit HD, original photo CC-BY-SA @benarent.
2. Installation & Setup
2.a) Using Docker Image [recommended]
The easiest way to get up-and-running is to install Docker. Then, you should be able to download and run the pre-built image using the docker command line tool. Find out more about the alexjc/neural-enhance image on its Docker Hub page.
Single Image — We suggest you set up an alias called enhance to automatically expose the folder containing your specified image, so the script can read it and store results where you can access them. This is how you can do it in your terminal console on OSX or Linux:
# Set up the alias. Put this in your .bashrc or .zshrc file so it's available at startup.
alias enhance='function ne() { docker run --rm -v "$(pwd)/`dirname ${@:$#}`":/ne/input -it alexjc/neural-enhance ${@:1:$#-1} "input/`basename ${@:$#}`"; }; ne'
# Now run any of the examples above using this alias, without the `.py` extension.
enhance --zoom=1 --model=small images/example.jpg
Multiple Images — To enhance multiple images in a row (faster) from a folder or wildcard specification, make sure to quote the argument to the alias command:
# Process multiple images, make sure to quote the argument!
enhance --zoom=2 --model=small "images/*.jpg"
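For readers who find the alias hard to parse, its logic can be sketched in Python. This illustrates what the alias above does and is not something shipped with the project; `docker_command` is a hypothetical name. The last argument is treated as the image path (or quoted wildcard): its folder is mounted at `/ne/input`, and only its base name is passed on, prefixed with `input/`.

```python
import os


def docker_command(args, image="alexjc/neural-enhance", cwd=None):
    """Build the `docker run` argument list that the alias would execute.

    `args` is everything typed after `enhance`; the last item is the file
    path or quoted wildcard, the rest are options passed through unchanged.
    """
    if not args:
        raise ValueError("expected at least an image path")
    *options, target = args
    cwd = os.getcwd() if cwd is None else cwd
    # Mount the folder that contains the target, relative to the working dir.
    host_dir = os.path.join(cwd, os.path.dirname(target))
    volume = "{}:/ne/input".format(host_dir)
    return (["docker", "run", "--rm", "-v", volume, "-it", image]
            + options + ["input/" + os.path.basename(target)])
```

This also shows why the wildcard has to be quoted: it must reach the container as a single unexpanded argument such as `input/*.jpg`, so that the script can expand it inside the mounted folder.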
If you want to run on your NVIDIA GPU, you can instead change the alias to use the image alexjc/neural-enhance:gpu, which comes with CUDA and cuDNN pre-installed. Then run it within nvidia-docker and it should use your physical hardware!
2.b) Manual Installation [developers]
This project requires Python 3.4+ and you'll also need numpy and scipy (numerical computing libraries) as well as python3-dev installed system-wide. If you want more detailed instructions, follow these:
- Linux Installation of Lasagne (intermediate)
- Mac OSX Installation of Lasagne (advanced)
- Windows Installation of Lasagne (expert)
After fetching the repository, you can run the following commands from your terminal to set up a local environment:
# Create a local environment for Python 3.x to install dependencies here.
python3 -m venv pyvenv --system-site-packages
# If you're using bash, make this the active version of Python.
source pyvenv/bin/activate
# Setup the required dependencies simply using the PIP module.
python3 -m pip install --ignore-installed -r requirements.txt
After this, you should have pillow, theano and lasagne installed in your virtual environment. You'll also need to download this pre-trained neural network (VGG19, 80 MB) and put it in the same folder as the script to run. To uninstall everything, you can just delete the #/pyvenv/ folder.
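A quick way to confirm that the environment is complete is to check that each dependency can be located by Python. This is a minimal sketch with a hypothetical helper name, `missing_modules`; note that pillow is imported under the module name `PIL`.

```python
import importlib.util


def missing_modules(names=("numpy", "scipy", "PIL", "theano", "lasagne")):
    """Return the module names from `names` that Python cannot locate."""
    return [n for n in names if importlib.util.find_spec(n) is None]
```

Running `missing_modules()` inside the activated virtual environment should return an empty list. Any name it does return points at a package to reinstall from requirements.txt or the system package manager.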
Example #3 — Specialized super-resolution for faces, trained on HD examples of celebrity faces only. The quality is significantly higher when narrowing the domain from "photos" in general.
3. Background & Research
This code uses a combination of techniques from the following papers, as well as some minor improvements yet to be documented (watch this repository for updates):
- [1] Perceptual Losses for Real-Time Style Transfer and Super-Resolution
- [2] Real-Time Super-Resolution Using Efficient Sub-Pixel Convolution
- [3] Deeply-Recursive Convolutional Network for Image Super-Resolution
- [4] Photo-Realistic Super-Resolution Using a Generative Adversarial Network
Special thanks for their help and support in various ways:
- Eder Santana — Discussions, encouragement, and his ideas on sub-pixel deconvolution.
- Andrew Brock — This sub-pixel layer code is based on his project repository using Lasagne.
- Casper Kaae Sønderby — For suggesting a more stable alternative to sigmoid + log as GAN loss functions.
4. Troubleshooting Problems
Can't install or Unable to find pgen, not compiling formal grammar.
There's a Python extension compiler called Cython, and it's missing or improperly installed. Try getting it directly from the system package manager rather than PIP.
FIX: sudo apt-get install cython3
NotImplementedError: AbstractConv2d theano optimization failed.
This happens when you're running without a GPU, and the CPU libraries were not found (e.g. libblas). The neural network expressions cannot be evaluated by Theano and it's raising an exception.
FIX: sudo apt-get install libblas-dev libopenblas-dev
TypeError: max_pool_2d() got an unexpected keyword argument 'mode'
You need to install Lasagne and Theano from the versions specified in requirements.txt, rather than the default PIP releases. Those releases are older and don't have the required features.
FIX: python3 -m pip install -r requirements.txt
ValueError: unknown locale: UTF-8
It seems your terminal is misconfigured and not compatible with the way Python treats locales. You may need to change this in your .bashrc or other startup script. Alternatively, this command will fix it once for this shell instance.
FIX: export LC_ALL=en_US.UTF-8
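To see which setting is responsible before changing it, the locale variables can be inspected from Python. This is a rough diagnostic sketch under one assumption: that the error comes from a locale variable holding a bare encoding name such as `UTF-8` with no language part, which is a common cause but not necessarily the only one. The helper name `suspicious_locale_vars` is hypothetical.

```python
import os


def suspicious_locale_vars(environ=None):
    """Return locale variables whose value is a bare encoding like 'UTF-8'."""
    environ = os.environ if environ is None else environ
    names = ("LC_ALL", "LC_CTYPE", "LANG")
    return [n for n in names
            if environ.get(n, "").upper() in ("UTF-8", "UTF8")]
```

If the function returns, say, `["LC_CTYPE"]`, that variable is the one to correct in your startup script.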
Example #4 — Street View: view comparison in 24-bit HD, original photo CC-BY-SA @cyalex.