VideoOneNet: Bidirectional Convolutional Recurrent OneNet with Trainable Data Steps for Video Processing
Paper (.pdf)
The code is mostly self-explanatory through its file, variable, and function names; the more complex lines are commented.
It is designed for minimal setup overhead, relying on Keras and sacred integration and on code reuse wherever possible.
Installing Python 3.7.9 on Ubuntu 20.04.2 LTS:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.7
Installing CUDA 10.0:
wget https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda_10.0.130_410.48_linux
sudo bash cuda_10.0.130_410.48_linux --override
echo 'export PATH=/usr/local/cuda-10.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
Installing cuDNN 7.6.5:
wget http://people.cs.uchicago.edu/~kauffman/nvidia/cudnn/cudnn-10.0-linux-x64-v7.6.5.32.tgz
# if the link is broken, log in and download from NVIDIA:
# https://developer.nvidia.com/compute/machine-learning/cudnn/secure/7.6.5.32/Production/10.0_20191031/cudnn-10.0-linux-x64-v7.6.5.32.tgz
tar -xvzf cudnn-10.0-linux-x64-v7.6.5.32.tgz
sudo cp -r cuda/include/* /usr/local/cuda-10.0/include/
sudo cp -r cuda/lib64/* /usr/local/cuda-10.0/lib64/
Installing Python packages with pip:
python3.7 -m pip install h5py==2.10.0 ipython==7.16.1 keras==2.2.4 matplotlib==3.3.2 numpy==1.19.2 pillow==8.1.0 pywavelets==1.1.1 sacred==0.8.2 scikit-learn==0.23.2 scipy==1.5.2 tensorflow-gpu==1.14.0 tqdm==4.56.0
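Because the versions above are pinned exactly, it can help to confirm what actually got installed before running an experiment. The following is a minimal sketch, not part of the repository: the names `PINNED`, `installed_version`, and `find_mismatches` are hypothetical, and the version numbers are copied from the pip command above.

```python
# Pinned versions copied from the pip command above.
PINNED = {
    "h5py": "2.10.0",
    "ipython": "7.16.1",
    "keras": "2.2.4",
    "matplotlib": "3.3.2",
    "numpy": "1.19.2",
    "pillow": "8.1.0",
    "pywavelets": "1.1.1",
    "sacred": "0.8.2",
    "scikit-learn": "0.23.2",
    "scipy": "1.5.2",
    "tensorflow-gpu": "1.14.0",
    "tqdm": "4.56.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        # Available from Python 3.8 on; Python 3.7 takes the fallback below.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin to a
    (pinned, installed) tuple; installed is None when the package is missing."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in sorted(find_mismatches(PINNED).items()):
        print("{}: pinned {}, installed {}".format(name, wanted, found))
```

Running the script with `python3.7` prints one line per package that deviates from its pin and prints nothing when the environment matches.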
Reproduction should be as easy as executing this in the root folder (after installing all dependencies):
python3.7 -m IPython experiments/mnistrotated.py with videoonenet seed=123
In general:
python3.7 -m IPython experiments/dataset.py with algorithm optional_config seed=number
where dataset is either:
mnistrotated : the Rotated MNIST video set, artificially generated by rotating and picking the top left corner,
cifar10scanned : the Scanned CIFAR-10 video set, artificially generated by sliding a window,
ucf101 : the UCF-101 natural video set;
algorithm is either:
videoonenet : our videoonenet method with 2 contributions,
videoonenetadmm : the original onenet baseline method,
videowaveletsparsityadmm : the wavelet sparsity baseline method;
and optional_config is either nothing (convolutional minimal recurrent layer enabled by default), or:
rnn : convolutional minimal recurrent layer enabled,
nornn : convolutional minimal recurrent layer disabled.
seed : 123 in all of our experiments; this should yield numbers very similar to those in the table of our paper
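The general command form can be assembled programmatically, for example when looping over several datasets and algorithms. This is a minimal sketch under the assumption that the dataset, algorithm, and optional config names are exactly those listed in this README; `build_command` and the three tuples are hypothetical helpers, not part of the repository.

```python
DATASETS = ("mnistrotated", "cifar10scanned", "ucf101")
ALGORITHMS = ("videoonenet", "videoonenetadmm", "videowaveletsparsityadmm")
OPTIONAL_CONFIGS = (None, "rnn", "nornn")


def build_command(dataset, algorithm, optional_config=None, seed=123,
                  python="python3.7"):
    """Return the argument list for one experiment run.

    Mirrors the form:
    python3.7 -m IPython experiments/dataset.py with algorithm optional_config seed=number
    """
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: {!r}".format(dataset))
    if algorithm not in ALGORITHMS:
        raise ValueError("unknown algorithm: {!r}".format(algorithm))
    if optional_config not in OPTIONAL_CONFIGS:
        raise ValueError("unknown optional_config: {!r}".format(optional_config))

    command = [python, "-m", "IPython",
               "experiments/{}.py".format(dataset), "with", algorithm]
    if optional_config is not None:
        command.append(optional_config)
    command.append("seed={}".format(int(seed)))
    return command


if __name__ == "__main__":
    # Print the reproduction command for the default setup.
    print(" ".join(build_command("mnistrotated", "videoonenet")))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the root folder of the repository to launch the run.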
algorithms/
kerasvideoonenet.py : base class, our videoonenet method with 2 contributions
kerasvideoonenet_admm.py : subclass, the original onenet baseline method
numpyvideowaveletsparsity_admm.py : subclass, the wavelet sparsity baseline method (CPU-only)
datasets/
mnistrotated.py : base class, loads Rotated MNIST data set and generates linear inverse problems on the fly
cifar10scanned.py : subclass, same but for Scanned CIFAR-10
ucf101.py : subclass, same but for UCF-101
experiments/
mnistrotated.py : config file for hyperparameters, loads Rotated MNIST data set and an algorithm,
conducts experiment; requires a ~8 GB GPU for training
cifar10scanned.py : same, but for Scanned CIFAR-10; requires a ~12 GB GPU for training
ucf101.py : same, but for UCF-101; requires two 24 GB GPUs for training
results/ : experimental results will be saved to this directory with sacred package
utils/
layers.py : custom Keras layer classes, including
ConvMinimalRNN2D : the convolutional minimal recurrent layer
InstanceNormalization : the instance normalization layer
UnrolledOptimization : the layer responsible for end-to-end trainable ADMM iterations, the core of our algorithms
ops.py : custom Keras/Tensorflow operations, including
elu_like : the activation function
batch_pinv : batched Moore-Penrose pseudoinverse computation
pil.py : functions for backwards compatibility for saving all kinds of figures
plot.py : functions for saving video frame figures
problems.py : functions for the linear inverse problem set, each generating the matrix A^{(n)} and an auxiliary
shape for saving figures
utils.py : additional utilities, including
LinearInverseVideoSequence : Keras Sequence subclass generating random videos and linear inverse problems from the given problem set
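To make the terms "linear inverse problem" and "batched Moore-Penrose pseudoinverse" from the listing above concrete, here is a small NumPy illustration. It is a sketch under stated assumptions, not the repository's code: it uses `np.linalg.pinv` instead of the `batch_pinv` operation, the helper `random_inpainting_matrix` and all shapes are hypothetical, and treating n as the index of an item in a batch is an assumption about the A^{(n)} notation.

```python
import numpy as np


def random_inpainting_matrix(num_pixels, num_kept, rng):
    """Return a (num_kept, num_pixels) 0/1 matrix keeping a random pixel subset."""
    kept = np.sort(rng.choice(num_pixels, size=num_kept, replace=False))
    return np.eye(num_pixels)[kept]


rng = np.random.RandomState(123)
num_items, num_pixels, num_kept = 4, 16, 10

# One flattened image per row: x has shape (items, pixels).
x = rng.rand(num_items, num_pixels)

# A different measurement matrix per item: shape (items, kept, pixels).
A = np.stack([random_inpainting_matrix(num_pixels, num_kept, rng)
              for _ in range(num_items)])

# Measurements y^(n) = A^(n) x^(n), computed for all items at once.
y = np.einsum("nkp,np->nk", A, x)

# np.linalg.pinv operates on stacks of matrices, so this yields one
# pseudoinverse per item: shape (items, pixels, kept).
A_pinv = np.linalg.pinv(A)

# Minimum-norm estimate x_hat^(n) = pinv(A^(n)) y^(n). For a pixel-selection
# matrix this copies the kept pixels and leaves the missing ones at zero.
x_hat = np.einsum("npk,nk->np", A_pinv, y)

# The estimate is consistent with the measurements: A x_hat equals y.
print(np.allclose(np.einsum("nkp,np->nk", A, x_hat), y))
```

The pseudoinverse estimate only fills in what the measurements determine; recovering the missing pixels is the part left to the learned or sparsity-based methods listed under algorithms/.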
@inproceedings{milacski2020videoonenet,
title={VideoOneNet: Bidirectional Convolutional Recurrent OneNet with Trainable Data Steps for Video Processing},
author={Milacski, Zolt{\'a}n and Poczos, Barnabas and Lorincz, Andras},
booktitle={International Conference on Machine Learning},
pages={6893--6904},
year={2020},
organization={PMLR}
}
If you have any questions, feel free to create an issue here on GitHub, or email me at srph25@gmail.com.