BigVGAN

My implementation of BigVGAN-base (paper) for JSUT (link), powered by Lightning.

The differences from HiFi-GAN (my implementation) are:

LeakyReLU is replaced by AntiAliasActivation, which is composed of 2x upsampling, Snake, and 2x downsampling.
The pre-activation of each ConvTranspose1d is removed, w.r.t. the paper.
Segment size is 32 instead of 64, because 64 exhausts the VRAM.
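
The upsample, Snake, downsample structure of AntiAliasActivation can be sketched roughly as below. This is a minimal illustration and not the code in this repository: the helper lowpass_kernel, the kernel size of 13, and the Hann-windowed sinc filter are all assumptions chosen for brevity, and the actual filter design here may differ. Snake is the periodic activation x + (1/alpha) * sin^2(alpha * x) with one learnable alpha per channel.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def lowpass_kernel(kernel_size: int = 13, cutoff: float = 0.25) -> torch.Tensor:
    """Hann-windowed sinc low-pass filter (assumed design, odd length).

    cutoff is in cycles per sample, so 0.25 is half of the Nyquist frequency.
    """
    t = torch.arange(kernel_size, dtype=torch.float32) - (kernel_size - 1) / 2
    h = 2 * cutoff * torch.sinc(2 * cutoff * t)
    h = h * torch.hann_window(kernel_size, periodic=False)
    return h / h.sum()  # normalize to unit gain at DC


class Snake(nn.Module):
    """Snake activation: x + (1 / alpha) * sin^2(alpha * x)."""

    def __init__(self, channels: int):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1, channels, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + torch.sin(self.alpha * x) ** 2 / (self.alpha + 1e-9)


class AntiAliasActivation(nn.Module):
    """2x upsample -> Snake -> 2x downsample, applied per channel."""

    def __init__(self, channels: int, kernel_size: int = 13):
        super().__init__()
        self.channels = channels
        self.pad = kernel_size // 2
        self.act = Snake(channels)
        # One copy of the filter per channel, shape (channels, 1, kernel_size).
        self.register_buffer("kernel", lowpass_kernel(kernel_size).repeat(channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, t = x.shape  # (batch, channels, time)
        # Upsample: insert zeros between samples, then low-pass.
        # The factor 2 compensates for the energy lost to the inserted zeros.
        up = x.new_zeros(b, c, 2 * t)
        up[..., ::2] = x
        up = F.conv1d(up, 2 * self.kernel, padding=self.pad, groups=c)
        # Apply the nonlinearity at the higher sampling rate.
        y = self.act(up)
        # Downsample: low-pass, keeping every second sample via stride=2.
        return F.conv1d(y, self.kernel, stride=2, padding=self.pad, groups=c)


layer = AntiAliasActivation(channels=4)
x = torch.randn(2, 4, 32)
y = layer(x)
print(y.shape)  # torch.Size([2, 4, 32])
```

The point of the wrapper is that Snake introduces new high-frequency content; running it at twice the sampling rate and low-pass filtering before returning to the original rate suppresses the part that would otherwise alias. The output has the same shape as the input, so the module can stand in wherever LeakyReLU was used.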

Usage

run.sh automatically downloads the data and begins training, so just execute the following commands:

cd scripts
./run.sh

synthesize.sh uses last.ckpt by default, so if you want to use specific weights, change the checkpoint it points to.

cd scripts
./synthesis.sh

pip install torch torchaudio lightning pandas

Trained for 1000 epochs (612000 steps) with batch_size = 16.
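
As a quick sanity check on these figures, the step count per epoch follows directly from the two totals. The variable names are illustrative only, and how one logged step maps to one batch depends on the training loop, which is not stated here.

```python
total_steps = 612_000  # total training steps reported above
epochs = 1_000         # number of epochs reported above

# Integer division with remainder: a remainder of 0 means the totals line up.
steps_per_epoch, remainder = divmod(total_steps, epochs)
print(steps_per_epoch, remainder)  # 612 0
```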

Some audio samples are in asset/sample/