CoverGAN is a set of tools and machine learning models designed to generate good-looking album covers based on users' audio tracks and emotions. Resulting covers are generated in vector graphics format (SVG).
Available emotions:
- Anger
- Comfortable
- Fear
- Funny
- Happy
- Inspirational
- Joy
- Lonely
- Nostalgic
- Passionate
- Quiet
- Relaxed
- Romantic
- Sadness
- Serious
- Soulful
- Surprise
- Sweet
- Wary
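The commands in this README pass emotion names in lowercase (joy, relaxed). Below is a minimal sketch of checking a comma-separated emotion string against the list above before using it. It assumes lowercase names are what the tools expect; `EMOTIONS` and `parse_emotions` are hypothetical helpers and are not part of CoverGAN.

```python
# Emotion names copied from the list above, lowercased (an assumption based on
# the lowercase values used in the example commands).
EMOTIONS = (
    "anger", "comfortable", "fear", "funny", "happy", "inspirational", "joy",
    "lonely", "nostalgic", "passionate", "quiet", "relaxed", "romantic",
    "sadness", "serious", "soulful", "surprise", "sweet", "wary",
)


def parse_emotions(raw):
    """Split a comma-separated string and reject names that are not in EMOTIONS."""
    names = [part.strip().lower() for part in raw.split(",") if part.strip()]
    unknown = [name for name in names if name not in EMOTIONS]
    if unknown:
        raise ValueError("unknown emotion(s): " + ", ".join(unknown))
    return names
```

For example, `parse_emotions("joy,relaxed")` yields a list that can be joined back with commas for the `--emotions` option.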
CoverGAN can run on a machine without a GPU. If a GPU is available, the model will use it.
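To check in advance which device would be available, a common PyTorch pattern looks like the sketch below. This is an illustration only and is not taken from the CoverGAN code; `pick_device` is a hypothetical helper that falls back to the CPU when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" if PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Device that would be used:", pick_device())
```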
This tutorial is for Linux.
Install PyTorch. See https://pytorch.org/ for PyTorch install instructions.
Install the other Python requirements:
pip install -r requirements.txt
Install DiffVG:
- Make sure that CMake is installed. See https://cmake.org/install/ for CMake install instructions.
git clone --recursive https://github.com/BachiLi/diffvg
cd diffvg && python setup.py install
cd .. && rm -rf diffvg
- Specify the PyTorch version to install in the Dockerfile.
- Build the image:
docker build -t covergan .
- Run the container:
With CUDA enabled:
docker run --rm --network="host" --gpus 1 covergan
CPU only:
docker run --rm --network="host" covergan
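The two docker run variants above differ only in the `--gpus 1` flag. A small sketch of building either command from a script follows; `docker_run_command` is a hypothetical helper, and the image name `covergan` matches the tag used in the build step.

```python
import shlex


def docker_run_command(use_gpu, image="covergan"):
    """Build the argv list for the docker run commands shown above."""
    cmd = ["docker", "run", "--rm", "--network=host"]
    if use_gpu:
        cmd += ["--gpus", "1"]  # expose one GPU to the container
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    # Print both variants; pass the list to subprocess.run to actually start one.
    print(shlex.join(docker_run_command(use_gpu=True)))
    print(shlex.join(docker_run_command(use_gpu=False)))
```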
Below is an example command that can be used to trigger the generation endpoint:
curl --progress-bar \
-F "audio_file=@/home/user/audio.flac" \
"http://localhost:5001/generate?track_artist=Cool%20Band&track_name=Song&emotion=joy" \
-o ./output.json
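The same request can be prepared from Python with the standard library only. The sketch below mirrors the curl command: the audio goes in a multipart form field named audio_file, and the track metadata goes in the query string. The `application/octet-stream` part type and the helper name `build_generate_request` are assumptions made for illustration; the response format is not described here, so the sketch only builds the request.

```python
import uuid
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_generate_request(audio_bytes, filename, track_artist, track_name,
                           emotion, base_url="http://localhost:5001/generate"):
    """Build a POST request equivalent to the curl example above."""
    # quote_via=quote encodes spaces as %20, as in the curl URL.
    query = urlencode(
        {"track_artist": track_artist, "track_name": track_name, "emotion": emotion},
        quote_via=quote,
    )
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="audio_file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",  # assumed part type
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return Request(f"{base_url}?{query}", data=body, headers=headers, method="POST")
```

To send it, pass the returned object to `urllib.request.urlopen` while the container is running and write the response body to a file such as ./output.json.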
In the protosvg folder:
- curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh (type yes, then 1)
- source $HOME/.cargo/env
- rustup component add rustfmt
- cargo install --locked --path .
- Run the ProtoSVG server as a background process with the command protosvg
In the covergan folder:
- Run
python3 ./eval.py \
--audio_file="test.mp3" \
--emotions=joy,relaxed \
--track_artist="Cool Band" \
--track_name="New Song"
- By default, the resulting .svg covers are saved to the ./gen_samples folder.
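When generating covers for several tracks, it can be convenient to build the eval.py command from a script. The sketch below uses only the four options shown in the example above; `eval_command` is a hypothetical helper and does not cover any other options eval.py may accept.

```python
import shlex


def eval_command(audio_file, emotions, track_artist, track_name):
    """Build the argv list for the eval.py invocation shown above."""
    return [
        "python3", "./eval.py",
        f"--audio_file={audio_file}",
        f"--emotions={','.join(emotions)}",  # comma-separated, as in the example
        f"--track_artist={track_artist}",
        f"--track_name={track_name}",
    ]


if __name__ == "__main__":
    cmd = eval_command("test.mp3", ["joy", "relaxed"], "Cool Band", "New Song")
    # Print a shell-safe version; run it with subprocess.run(cmd, check=True)
    # from the covergan folder while the ProtoSVG server is running.
    print(shlex.join(cmd))
```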
The main directory is covergan.
- Put audio tracks (.flac or .mp3 format) in the default ./audio folder.
- Put original covers (.jpg format) in the default ./clean_covers folder.
- See this help document for more details about the available options.
- Run the ./covergan_train.py file with the chosen options.
Example:
python3 ./covergan_train.py --emotions emotions.json
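Before starting a training run, a quick check of the input folders can catch misplaced files. The sketch below counts files with the extensions named in the steps above. Matching extensions case-insensitively is an assumption, and `is_audio`, `is_cover`, and `summarize` are hypothetical helpers that are not part of CoverGAN.

```python
from pathlib import Path

AUDIO_EXTS = {".flac", ".mp3"}  # audio formats named in the steps above
COVER_EXTS = {".jpg"}           # cover format named in the steps above


def is_audio(name):
    return Path(name).suffix.lower() in AUDIO_EXTS


def is_cover(name):
    return Path(name).suffix.lower() in COVER_EXTS


def summarize(audio_dir="./audio", covers_dir="./clean_covers"):
    """Count usable files in each folder; a missing folder counts as empty."""
    def names(directory):
        directory = Path(directory)
        return [p.name for p in directory.iterdir()] if directory.is_dir() else []

    return {
        "audio": sum(is_audio(n) for n in names(audio_dir)),
        "covers": sum(is_cover(n) for n in names(covers_dir)),
    }


if __name__ == "__main__":
    print(summarize())
```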
- Put original covers (.jpg format) in the ./original_covers folder.
- If the cover titles and author names have already been saved to a data.json file (which stores, for each cover, the coordinates of the captions and their text color), store it at ./checkpoint/caption_dataset/data.json.
- Otherwise, put clean covers (with the captions removed, .jpg format) in the ./clean_covers folder.
- See this help document for more details about the available options.
- Run the ./captioner_train.py file with the chosen options.
Example:
python3 ./captioner_train.py --clean_covers ./clean_covers
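The steps above describe two alternative inputs: an existing data.json, or a folder of clean covers. A minimal sketch of checking which case applies before training follows; `needs_clean_covers` is a hypothetical helper, and it only tests for the file at the path given above without reading its contents.

```python
from pathlib import Path

# Location of the caption data, relative to the main directory (from the steps above).
DATA_JSON = Path("checkpoint/caption_dataset/data.json")


def needs_clean_covers(root="."):
    """Return True when data.json is absent, so ./clean_covers must be provided."""
    return not (Path(root) / DATA_JSON).is_file()


if __name__ == "__main__":
    if needs_clean_covers():
        print("data.json not found: put clean covers in ./clean_covers")
    else:
        print("Using existing", DATA_JSON)
```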
See this folder with simple music tracks and their generated covers.