The quickest way to start experimenting is to use this model trained on this smaller dataset. An arbitrary dataset can be constructed by downloading frames using the scripts provided here.
See the wiki for more details, download links, and further results.
```
python train.py --config ../configs/train.yaml
```
Example `train.yaml`:
```yaml
exp_name: training
num_epochs: 1
batch_size: 16
learning_rate: 0.0001
# start fresh
resume: false
checkpoint: null
start_epoch: 1
batch_every: 1
save_every: 10
epoch_every: 1
shuffle: true
dataset_path: datasets/yt_small_720p
num_workers: 2
device: cuda
```
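As a rough illustration of how such a config can be consumed, here is a minimal sketch that parses the YAML into attribute-style options. It assumes PyYAML is installed; the `SimpleNamespace` access pattern is an assumption for illustration, not necessarily how `train.py` reads the file.

```python
# Minimal sketch: parse a flat YAML config into attribute-style options.
# Assumes PyYAML; the namespace wrapper is illustrative only.
import io
from types import SimpleNamespace

import yaml

CONFIG_TEXT = """
exp_name: training
num_epochs: 1
batch_size: 16
learning_rate: 0.0001
resume: false
device: cuda
"""

# safe_load returns a plain dict; wrap it for dot access.
cfg = SimpleNamespace(**yaml.safe_load(io.StringIO(CONFIG_TEXT)))
print(cfg.exp_name, cfg.batch_size, cfg.learning_rate)
```

The same loader would work unchanged for the `test.yaml` below, since both are flat key-value files.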
Given a trained model (`checkpoint`), perform inference on the images at `dataset_path`.
```
python test.py --config ../configs/test.yaml
```
Example `test.yaml`:
```yaml
exp_name: testing
checkpoint: model.state
batch_every: 100
shuffle: true
dataset_path: datasets/testing
num_workers: 1
device: cuda
```
Run `detect.py` with the following arguments:
```
optional arguments:
  -h, --help       show this help message and exit
  --config CONFIG
  --image IMAGE
  --size
  --patch PATCH
  --resize
```
For example:

```
python3 detect.py --config ../configs/test.yaml --image test.jpg --patch 512
```

This command runs the 32x32x32 CAE model on `test.jpg` with a patch size of 512. With `--resize`, each patch-size input is resized to 128x128 before being fed to the network. Without `--resize`, the network grows to fit the input size, which increases memory usage and limits speed; the bottleneck feature size stays the same either way.
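The patching behaviour described above can be sketched as follows: the image is zero-padded up to a multiple of the patch size and cut into square tiles. This is an illustrative sketch only; the function name and array layout are assumptions, not the repo's actual code.

```python
# Illustrative sketch of patch extraction: pad an (H, W, C) image with
# zeros to a multiple of `patch`, then cut it into square tiles.
import numpy as np


def extract_patches(img: np.ndarray, patch: int) -> np.ndarray:
    """Return tiles with shape (rows, cols, patch, patch, C)."""
    h, w, c = img.shape
    pad_h = (-h) % patch  # zero rows needed to reach a multiple of patch
    pad_w = (-w) % patch  # zero cols needed to reach a multiple of patch
    img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)))
    rows, cols = img.shape[0] // patch, img.shape[1] // patch
    return img.reshape(rows, patch, cols, patch, c).swapaxes(1, 2)


img = np.zeros((600, 900, 3), dtype=np.uint8)
tiles = extract_patches(img, 512)
print(tiles.shape)  # (2, 2, 512, 512, 3)
```

With `--resize`, each such tile would additionally be scaled down to 128x128 before encoding; without it, the network operates on the full patch-size tiles.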
Note: Currently, smoothing (i.e. linear interpolation in `smoothing.py`) is used to account for the noisy areas between patches caused by padding (this still needs further investigation).
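In the spirit of that smoothing step, here is a hedged sketch of linearly blending the seam between two horizontally adjacent patches: over an overlap of `k` pixels, the weight ramps linearly from the left patch to the right one. The overlap width and function name are assumptions for illustration, not taken from `smoothing.py`.

```python
# Sketch of linear seam blending between two adjacent patches.
# The overlap width `k` is an assumed parameter, not from the repo.
import numpy as np


def blend_horizontal(left: np.ndarray, right: np.ndarray, k: int) -> np.ndarray:
    """Blend the last k columns of `left` with the first k columns of `right`."""
    w = np.linspace(1.0, 0.0, k)  # weight for the left patch, 1 -> 0
    seam = left[:, -k:] * w[None, :] + right[:, :k] * (1.0 - w)[None, :]
    return np.concatenate([left[:, :-k], seam, right[:, k:]], axis=1)


a = np.ones((4, 8))   # stand-in for a reconstructed patch
b = np.zeros((4, 8))  # its right-hand neighbour
out = blend_horizontal(a, b, 4)
print(out.shape)  # (4, 12)
```

The seam columns fade smoothly from the left patch's values to the right patch's, which is what hides the noisy borders introduced by zero padding.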
`cae_32x32x32_zero_pad_bin` model:

- roughly 5.8 million optimization steps
- 121,827 randomly selected and downloaded frames
- left: original, right: reconstructed