
QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion

Primary language: Python. License: MIT.

QuickVC

This repository contains the open-source code, audio samples, and pretrained models for my paper: QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion

Put the pretrained model into logs/quickvc.

Inference with pretrained model

python convert.py

Edit convert.txt to select the source and the target.

Preprocess

  1. Hubert-Soft
cd dataset
python encode.py soft dataset/VCTK-16K dataset/VCTK-16K
  2. Spectrogram-resize data augmentation: please refer to FreeVC.

Train

python train.py

If you want to change the config or the model name, change the defaults of these arguments in utils.py:

parser.add_argument('-c', '--config', type=str, default="./configs/quickvc.json", help='JSON file for configuration')
parser.add_argument('-m', '--model', type=str, default="quickvc", help='Model name')
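
Because these are ordinary argparse defaults, they can usually be overridden from the command line instead of being edited. The sketch below rebuilds only these two arguments in isolation to show that behavior. `build_parser` is a hypothetical helper, not a function from this repository, and the custom config path and model name are made-up example values.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Recreate the two arguments quoted above (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--config', type=str,
                        default="./configs/quickvc.json",
                        help='JSON file for configuration')
    parser.add_argument('-m', '--model', type=str,
                        default="quickvc",
                        help='Model name')
    return parser


# No flags given: the defaults are used.
defaults = build_parser().parse_args([])

# Flags given: the defaults are overridden without editing any file.
# Both values below are made-up examples, not files in this repository.
custom = build_parser().parse_args(
    ["-c", "./configs/my_config.json", "-m", "my_model"])

print(defaults.config, defaults.model)
print(custom.config, custom.model)
```

Assuming train.py reads its arguments from the command line through this parser, `python train.py -c ./configs/my_config.json -m my_model` should have the same effect as editing the defaults.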

To use the spectrogram-resize (sr) augmentation during training, change this part to

i = random.randint(68,92)
c_filename = filename.replace(".wav", f"_{i}.npy")
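
The two lines above pick one augmented variant per sample at random. `random.randint(68, 92)` includes both endpoints, so `i` ranges over 25 values. Here is a minimal standalone sketch of that lookup, assuming the augmented features were saved beforehand as `<name>_<i>.npy` next to each wav file. `pick_sr_feature_path` and the example filename are hypothetical.

```python
import random


def pick_sr_feature_path(filename: str, rng: random.Random) -> str:
    """Return the path of one randomly chosen augmented feature file.

    Hypothetical helper: assumes the augmented features for `x.wav`
    are stored as `x_68.npy` ... `x_92.npy`.
    """
    i = rng.randint(68, 92)  # randint includes both endpoints
    return filename.replace(".wav", f"_{i}.npy")


rng = random.Random(0)  # seeded only to make the demo repeatable
# "p225_001.wav" is an illustrative filename, not a required path.
paths = [pick_sr_feature_path("p225_001.wav", rng) for _ in range(5)]
print(paths)
```

If a chosen `.npy` file does not exist, loading it will fail, so the augmented features for every index from 68 to 92 need to be generated before training.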

References

If you have any questions about the decoder, refer to MS-ISTFT-VITS.

If you have any questions about Hubert-Soft, refer to Soft-VC.

If you have any questions about the data augmentation, refer to FreeVC.

If you run into any problems, feel free to contact me.