This repo lets you make a video from an audio file, where the video is made up of images generated by a GAN. You can read more about it in my paper, which was accepted to the 4th Workshop on Machine Learning for Creativity and Design at NeurIPS 2020, or in my blog post.
If you tweet things you've generated with this, please use the hashtag #GANterpretation so I can see the awesome things you do!
To cite this work, please use:
```
@inproceedings{castro20ganterpretations,
  author    = {Pablo Samuel Castro},
  title     = {GANterpretations},
  year      = {2020},
  booktitle = {4th Workshop on Machine Learning for Creativity and Design at NeurIPS 2020},
}
```
Code for making #GANterpretations is in the `src/` subdirectory.
For your convenience, here is a website where you can scan through samples for the 1000 categories in BigGAN.
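If you'd rather preview a category locally, here is a minimal sketch of sampling one image for a given class index. It assumes the `pytorch-pretrained-biggan` package (not necessarily what this repo uses internally) and picks class 419 from the sample command below purely as an illustration.

```python
# Minimal sketch (assumes the pytorch-pretrained-biggan package, which
# is not necessarily what this repo uses internally): generate a single
# image for one BigGAN class index.
import torch
from pytorch_pretrained_biggan import (
    BigGAN, one_hot_from_int, truncated_noise_sample, save_as_images)

# Load a pretrained generator that produces 512x512 images.
model = BigGAN.from_pretrained('biggan-deep-512')

truncation = 0.4  # lower values trade diversity for fidelity
class_vector = torch.from_numpy(one_hot_from_int(419, batch_size=1))
noise_vector = torch.from_numpy(
    truncated_noise_sample(truncation=truncation, batch_size=1))

with torch.no_grad():
    output = model(noise_vector, class_vector, truncation)

save_as_images(output, file_name='category_419')  # writes category_419_0.png
```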
A sample command for running the pipeline with a specified initial set of categories:
```
python src/run_ganterpreter.py --verbose \
  --wav_path=${WAV_PATH} \
  --output_dir=${OUTPUT_DIR} \
  --inflection_threshold=1e-2 \
  --video_file_name=${VIDEO_FILENAME}.avi \
  --selected_categories=419,419,419,107,617,127,730,3
```
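Here `${WAV_PATH}`, `${OUTPUT_DIR}`, and `${VIDEO_FILENAME}` are shell variables you set to your input audio file, an output directory, and the desired video name; `--selected_categories` takes a comma-separated list of BigGAN class indices, which may be repeated (as in the example above).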
You can use this colab notebook to determine what value to use for `inflection_threshold`.
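As a rough local alternative, the sketch below shows one way a threshold on frame-to-frame spectral change could flag candidate inflection points. It is an illustrative assumption, not the repo's exact algorithm, and it assumes `librosa` and `numpy` are installed.

```python
# Rough sketch (an assumption, not necessarily this repo's method):
# count frames where the normalized spectrogram changes by more than
# a threshold, as a proxy for inflection points in the audio.
import librosa
import numpy as np

def count_inflections(wav_path, threshold=1e-2):
    signal, _ = librosa.load(wav_path, sr=None)  # keep native sample rate
    spec = np.abs(librosa.stft(signal))
    spec /= spec.max() + 1e-9  # normalize magnitudes to [0, 1]
    # Mean absolute change between consecutive frames.
    diffs = np.mean(np.abs(np.diff(spec, axis=1)), axis=0)
    return int(np.sum(diffs > threshold))

# Sweeping a few values shows how sensitive the count is:
# for t in (1e-3, 1e-2, 1e-1):
#     print(t, count_inflections('my_audio.wav', t))
```

Loosely, a lower threshold yields more inflection points, which in this pipeline should translate into more frequent visual transitions.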