[CVPR'19, ICLR'20] A Python toolbox for modeling and optimization of photo acquisition & distribution pipelines (camera ISP, compression, forensics, manipulation detection)

Neural Imaging Toolbox

Authors: Paweł Korus and Nasir Memon, New York University

A Python+Tensorflow toolbox for modeling and optimization of photo acquisition, distribution, and forensic analysis. It enables joint optimization of various components, e.g.:

  • fine-tuning of camera ISP to actively introduce traces useful for manipulation detection,
  • fine-tuning of lossy compression to leave more relevant statistical traces in the photographs.

[Figure: training for optimized manipulation detection]

The toolbox provides several general-purpose components (camera ISP, differentiable JPEG, learned lossy compression, state-of-the-art image forensics) that can be combined into various workflows.

ℹ️ A standalone version of our lossy compression codec can be found in the l3ic repository.

⚠️ The current implementation uses Tensorflow 2.x. Legacy versions using Tensorflow 1.x can be accessed via git history.

References

  1. P. Korus, N. Memon, Content Authentication for Neural Imaging Pipelines: End-to-end Optimization of Photo Provenance in Complex Distribution Channels, CVPR'19, arXiv:1812.01516
  2. P. Korus, N. Memon, Neural Imaging Pipelines - the Scourge or Hope of Forensics?, 2019, arXiv:1902.10707
  3. P. Korus, N. Memon, Quantifying the Cost of Reliable Photo Authentication via High-Performance Learned Lossy Representations, ICLR'20, openreview

Pre-trained Models

Change Log

  • 2020.04 - Ported to Tensorflow 2.x + added workflows, classic ISPs, JPEG improvements
  • 2019.12 - Added support for learned compression, configurable manipulations + major refactoring

Setup

The toolbox is written in Python 3. Follow the standard procedure to install dependencies. Some modules need to be compiled for your platform.

> git clone https://github.com/pkorus/neural-imaging && cd neural-imaging
> pip3 install -r requirements.txt
> mkdir -p data/{raw,rgb}
> git submodule update --init
> cd pyfse && make && cd ..

Sample datasets and pre-trained models can be downloaded at pkorus.pl/downloads/neural-imaging-resources.

Data Directory Structure

The toolbox uses the data directory to store images, training data and pre-trained models:

config/                                 - various configuration files
data/raw/                               - RAW images used for camera ISP training
  |- images/{camera name}                 input RAW images (*.nef *.dng)
  |- training_data/{camera name}          Bayer stacks (*.npy) and developed (*.png)
  |- developed/{camera name}/{nip}        output RGB images (*.png)
data/rgb/                               - RGB images used for compression training
  |- kodak                                A sample dataset with kodak images
data/models                             - pre-trained TF models
  |- nip/{camera name}/{nip}              NIP models (TF checkpoints)
  |- isp                                  Classic ISP models (TF checkpoints)
  |- dcn/baselines/{dcn model}            DCN models (TF checkpoints)
data/m                                  - manipulation training results
data/results                            - CSV files with exported results

Getting Started

A great place to get started quickly is the getting_started.ipynb notebook.

Available Models

Camera ISP Models

Pipeline    Description
ClassicISP  a standard ISP model with neural demosaicing
INet        simple NIP which mimics step-by-step processing of the standard pipeline
UNet        the well known UNet network
DNet        medium-sized model adapted from a recent architecture for joint demosaicing and denoising
ONet        dummy model for directly feeding RGB images into the pipeline
libRAW      uses the libRAW library to develop RAW images
Python      simple Python implementation of a standard pipeline

Fully differentiable pipelines (ClassicISP and *Net) are implemented in models/pipelines. The last two standard pipelines are not differentiable and are not integrated with other models - they are available in the helpers.raw module.

JPEG Codec

We provide a fully differentiable implementation of the JPEG codec and a high-level interface that allows for switching between the differentiable codec and libJPEG. For more information, see the JPEG Codecs section and the docstrings in models.jpeg.JPEG.
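
The main obstacle to differentiating through JPEG is the rounding step in coefficient quantization, whose gradient is zero almost everywhere. As an illustration only, and not necessarily the approximation used in models.jpeg.JPEG, the sketch below replaces rounding with a smooth sine-based surrogate. All function names here are hypothetical.

```python
import math


def soft_round(x):
    """Smooth surrogate for round(x): x - sin(2*pi*x) / (2*pi).

    It equals x at integers and half-integers and pulls all other
    values towards the nearest integer.
    """
    return x - math.sin(2 * math.pi * x) / (2 * math.pi)


def soft_round_grad(x):
    """Derivative of soft_round: 1 - cos(2*pi*x), which lies in [0, 2]."""
    return 1 - math.cos(2 * math.pi * x)


def quantize(coefficient, step, rounding=round):
    """Quantize a DCT coefficient with the given step and rounding function."""
    return step * rounding(coefficient / step)


hard = quantize(37.0, 16)              # standard rounding, no useful gradient
soft = quantize(37.0, 16, soft_round)  # differentiable surrogate
```

The surrogate lands between the unquantized value and the hard-rounded one, and its derivative is non-zero away from integers, which is what lets gradients flow back through the quantization step during training.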

Learned Codec

Our neural image compression codec is an adaptation of the auto-encoder architecture proposed by Twitter (Theis et al., Lossy Image Compression with Compressive Autoencoders), and hence is dubbed TwitterDCN (see models/compression.py). A standalone version is also available in the neural-image-compression repository.

Image Forensics

Our Forensic Analysis Network (FAN) follows the state-of-the-art design principles and uses a constrained convolutional layer proposed in:

While the original model used only the green channel, our FAN uses full RGB information for forensic analysis. See the models.forensics.FAN class for our Tensorflow implementation.
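
As a rough illustration of what a constrained convolutional layer does, the sketch below projects a filter onto a prediction-error constraint: the center weight is fixed to -1 and the remaining weights are rescaled to sum to 1. This form of the constraint is an assumption based on how such layers are commonly described in the forensics literature. It is not taken from models.forensics.FAN, and the function name is hypothetical.

```python
def project_constrained_kernel(kernel):
    """Project a square, odd-sized 2D kernel onto the constraint set:
    center weight = -1, all remaining weights sum to 1."""
    size = len(kernel)
    center = size // 2
    # Sum of all weights except the center one
    total = sum(
        w
        for i, row in enumerate(kernel)
        for j, w in enumerate(row)
        if (i, j) != (center, center)
    )
    if total == 0:
        raise ValueError("off-center weights sum to zero; cannot normalize")
    projected = [[w / total for w in row] for row in kernel]
    projected[center][center] = -1.0
    return projected
```

After projection the whole filter sums to zero, so it responds to the residual between a pixel and its neighborhood-based prediction rather than to image content. In training, such a projection would typically be re-applied after every weight update. With RGB input, as in the FAN, it would presumably be applied to each input-channel slice of each filter.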

Pre-training

We generally follow a 2-step protocol with separate model pre-training (camera ISP, compression) and joint optimization/fine-tuning for specific applications (retraining from scratch is also possible, but has not been tested extensively).

Workflows

Individual components can be combined into workflows that model specific applications and allow for joint optimization of the entire pipeline. The current version of the toolbox provides an example workflow for manipulation classification - a standard benchmark for image forensics.

Manipulation Classification

The manipulation classification workflow involves training a forensic analysis network (FAN) to identify subtle post-processing operations applied to the image. The model starts with the camera ISP and is followed by photo manipulations and a distribution channel. The FAN can access images after they have been degraded (e.g., down-sampled and compressed) by the channel. The model is shown below:
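
The data flow described above can be sketched as a toy example. Everything in it is a stand-in: images are flat lists of floats, the ISP, manipulations and channel are trivial placeholder functions, and none of the names correspond to the toolbox API. It only shows how labeled training samples for the FAN arise from the chain ISP → manipulation → channel.

```python
def camera_isp(raw):
    """Stand-in for a camera ISP: here it only clips values to [0, 1]."""
    return [min(max(v, 0.0), 1.0) for v in raw]


# Hypothetical manipulations; class 0 is the unmodified image.
MANIPULATIONS = [
    lambda img: list(img),                                              # native
    lambda img: [v ** 0.8 for v in img],                                # gamma
    lambda img: [(a + b) / 2 for a, b in zip(img, img[1:] + img[-1:])], # blur
]


def channel(img, factor=2, levels=32):
    """Stand-in for a distribution channel: down-sample, then quantize."""
    small = img[::factor]
    return [round(v * (levels - 1)) / (levels - 1) for v in small]


def make_training_batch(raw_images):
    """Return (degraded image, manipulation label) pairs for the classifier."""
    batch = []
    for raw in raw_images:
        rgb = camera_isp(raw)
        for label, manipulate in enumerate(MANIPULATIONS):
            batch.append((channel(manipulate(rgb)), label))
    return batch
```

Unlike this toy version, joint optimization requires every stage to be differentiable, so that the classification loss can be propagated back to the ISP and compression parameters.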

Extending the Framework

The framework can be easily extended. See the extensions section for information on how to get started. If you would like to contribute new models or training protocols, or if you find any bugs, please let us know.

Script Summary

Script                         Description
develop_images.py              batch rendering of RAW images using various camera ISPs
diff_nip.py                    compare RAW rendering results of two camera ISPs
results.py                     visualization of FAN optimization results
test_dcn.py                    test neural image compression / generate rate-distortion profiles
test_dcn_rate_dist.py          plot rate-distortion (R/D) curves for neural image compression (requires pre-computing R/D profiles using test_dcn.py)
test_fan.py                    allows for testing trained FAN models on various datasets
test_framework.py              a rudimentary test of the entire framework (see testing)
test_jpeg.py                   test differentiable approximation of the JPEG codec
test_nip.py                    test a pre-trained camera ISP
train_dcn.py                   pre-train lossy compression
train_manipulation.py          optimization of the FAN (+NIP/DCN) for manipulation detection
train_nip.py                   pre-train camera ISPs
train_prepare_training_set.py  prepare training data for camera ISP pre-training (imports RAW images)
summarize_nip.py               extracts and summarizes performance stats for standalone NIP models
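
For readers unfamiliar with rate-distortion profiles: each point on an R/D curve pairs a bitrate (bits per pixel) with a quality score for the decoded image. The sketch below shows how one such point could be computed. It is not the code used by test_dcn.py. The function names are hypothetical, and PSNR is used only as a simple distortion measure, so the toolbox may report other metrics.

```python
import math


def bits_per_pixel(num_bytes, height, width):
    """Bitrate of a compressed image in bits per pixel (bpp)."""
    return 8 * num_bytes / (height * width)


def psnr(original, decoded, peak=1.0):
    """Peak signal-to-noise ratio (dB) between two equally sized pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# One R/D point: a 128 x 128 image compressed to 4096 bytes
rate = bits_per_pixel(4096, 128, 128)
```

Repeating this for several codec settings and plotting quality against bitrate gives the kind of curve produced by test_dcn_rate_dist.py.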

Data Sources

In our experiments we used RAW images from publicly available datasets:

Usage and Citations

This code is provided for educational purposes and aims to facilitate reproduction of our results and further research in this direction. We have done our best to document, refactor, and test the code before publication. However, the toolbox is provided "as-is", without warranties of any kind.

If you find this code useful in your work, please cite our papers:

@inproceedings{korus2019content,
  title={Content Authentication for Neural Imaging Pipelines: End-to-end Optimization of Photo Provenance in Complex Distribution Channels},
  author={Korus, Pawel and Memon, Nasir},
  booktitle={IEEE Conf. Computer Vision and Pattern Recognition},
  year={2019}
}
@article{korus2019neural,
  title={Neural Imaging Pipelines - the Scourge or Hope of Forensics?},
  author={Korus, Pawel and Memon, Nasir},
  journal={arXiv preprint arXiv:1902.10707},
  year={2019}
}
@inproceedings{korus2020quantifying,
  title={Quantifying the Cost of Reliable Photo Authentication via High-Performance Learned Lossy Representations},
  author={Korus, Pawel and Memon, Nasir},
  booktitle={International Conference on Learning Representations},
  year={2020}
}

Related Work

End-to-end ISP optimization:

Learned Compression

Acknowledgements

This work was supported by the NSF award number 1909488.