Compressive Autoencoder

Getting started

The quickest way to start experimenting is to use the provided pretrained model together with the smaller dataset. An arbitrary dataset can be constructed by downloading frames with the provided scripts.

See the wiki for more details, download links, and further results.

Training

python train.py --config ../configs/train.yaml

Example train.yaml:

exp_name: training

num_epochs: 1
batch_size: 16
learning_rate: 0.0001

# start fresh
resume: false
checkpoint: null
start_epoch: 1

batch_every: 1
save_every: 10
epoch_every: 1
shuffle: true
dataset_path: datasets/yt_small_720p
num_workers: 2
device: cuda
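
The config is passed to the script via --config. As a minimal sketch of how train.py might read it (assuming PyYAML; the key names match the example above, but the loading code itself is an assumption, not the repository's implementation):

import argparse
import yaml

parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True, help="path to a YAML config such as train.yaml")
args = parser.parse_args()

with open(args.config) as f:
    cfg = yaml.safe_load(f)

# Values keep their YAML types: ints, floats, bools and strings.
print(cfg["exp_name"], cfg["num_epochs"], cfg["learning_rate"], cfg["device"])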

Testing

Given a trained model (checkpoint), test.py performs inference on the images at dataset_path.

python test.py --config ../configs/test.yaml

Example test.yaml:

exp_name: testing
checkpoint: model.state
batch_every: 100
shuffle: true
dataset_path: datasets/testing
num_workers: 1
device: cuda
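
As a rough sketch of what the testing entry point does, assuming the checkpoint file is a plain PyTorch state dict (the actual checkpoint format and model class used by the repository may differ):

import torch
import yaml

def load_for_testing(model: torch.nn.Module, config_path: str = "../configs/test.yaml") -> torch.nn.Module:
    # Restore the checkpoint named in test.yaml and move the model to the
    # configured device, ready for inference on images under dataset_path.
    with open(config_path) as f:
        cfg = yaml.safe_load(f)
    state = torch.load(cfg["checkpoint"], map_location=cfg["device"])
    model.load_state_dict(state)
    return model.to(cfg["device"]).eval()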

Detecting on a Single Image/Frame

Run detect.py with the following arguments:

optional arguments:
  -h, --help       show this help message and exit
  --config CONFIG
  --image IMAGE
  --size
  --patch PATCH
  --resize

For example, the command python3 detect.py --config ../configs/test.yaml --image test.jpg --patch 512 runs the 32x32x32 CAE model on test.jpg with a patch size of 512. Passing --resize means that each patch is resized to 128x128 before being fed to the network. Without --resize, the network grows to fit the input size, which increases memory usage and limits speed; the bottleneck feature size stays the same.
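
A rough illustration of the patch-wise mode (not the repository's detect.py): crop the input into patch-sized tiles, optionally resize each tile to 128x128 before it goes through the model, and paste the reconstruction back. The function below works with any trained torch.nn.Module; its name and structure are assumptions.

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision.transforms.functional import to_tensor

def run_patches(model, image_path: str, patch: int = 512, resize: bool = True):
    img = to_tensor(Image.open(image_path).convert("RGB"))  # C x H x W in [0, 1]
    _, h, w = img.shape
    recon = torch.zeros_like(img)
    with torch.no_grad():
        for top in range(0, h, patch):
            for left in range(0, w, patch):
                tile = img[:, top:top + patch, left:left + patch]
                th, tw = tile.shape[1:]
                x = tile.unsqueeze(0)
                if resize:
                    # a fixed 128x128 input keeps memory use constant
                    x = F.interpolate(x, size=(128, 128), mode="bilinear", align_corners=False)
                y = model(x)  # hypothetical forward pass through the CAE
                y = F.interpolate(y, size=(th, tw), mode="bilinear", align_corners=False)
                recon[:, top:top + patch, left:left + patch] = y.squeeze(0)
    return recon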

Note: Currently, smoothing (linear interpolation, implemented in smoothing.py) is used to account for the noisy areas between patches caused by padding (this still needs further investigation).
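
The actual blending lives in smoothing.py; purely as an illustration of the idea, a horizontal seam between two stacked patches can be hidden by linearly interpolating a narrow strip of rows between the clean rows on either side:

import numpy as np

def smooth_horizontal_seam(img: np.ndarray, seam_row: int, band: int = 4) -> np.ndarray:
    # img is H x W x C; seam_row is the row where two reconstructed patches meet.
    # Rewrite the 2*band rows around the seam as a linear blend of the first
    # clean row above and the first clean row below the strip.
    out = img.astype(np.float32).copy()
    top = out[seam_row - band - 1]
    bottom = out[seam_row + band]
    for i, r in enumerate(range(seam_row - band, seam_row + band)):
        t = (i + 1) / (2 * band + 1)
        out[r] = (1 - t) * top + t * bottom
    return out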

Results

  • cae_32x32x32_zero_pad_bin model
  • roughly 5.8 million optimization steps
  • 121,827 randomly selected and downloaded frames
  • left: original, right: reconstructed

References