SoftVC VITS Singing Voice Conversion Fork

A fork of so-vits-svc with realtime support and greatly improved interface. Based on branch 4.0 (v1) and the models are compatible.

Installation

Install this via pip (or your favourite package manager):

pip install -U torch torchaudio --index-url https://download.pytorch.org/whl/cu117
pip install so-vits-svc-fork

Features not available in the original repo

Realtime voice conversion
GUI available
Unified command-line interface (no need to run Python scripts)
Ready to use just by installing with pip.
Automatically download pretrained base model and HuBERT model
Code completely formatted with black, isort, autoflake etc.
Other minor differences

Usage

Inference

GUI

GUI launches with the following command:

svcg

CLI

Realtime (from microphone)

svc vc --model-path <model-path>

File

svc --model-path <model-path> source.wav

Training

Use of Google Colab is recommended. (To train locally, you need at least 12GB of VRAM.)

Google Colab

Local

Place your dataset like dataset_raw/{speaker_id}/{wav_file}.wav and run:

svc pre-resample
svc pre-config
svc pre-hubert
svc train

Further help

For more details, run svc -h or svc <subcommand> -h.

> svc -h
Usage: svc [OPTIONS] COMMAND [ARGS]...

  so-vits-svc allows any folder structure for training data.
  However, the following folder structure is recommended.
      When training: dataset_raw/{speaker_name}/{wav_name}.wav
      When inference: configs/44k/config.json, logs/44k/G_XXXX.pth
  If the folder structure is followed, you DO NOT NEED TO SPECIFY model path, config path, etc.
  (The latest model will be automatically loaded.)
  To train a model, run pre-resample, pre-config, pre-hubert, train.
  To infer a model, run infer.

Options:
  -h, --help  Show this message and exit.

Commands:
  clean          Clean up files, only useful if you are using the default file structure
  infer          Inference
  onnx           Export model to onnx
  pre-config     Preprocessing part 2: config
  pre-hubert     Preprocessing part 3: hubert If the HuBERT model is not found, it will be...
  pre-resample   Preprocessing part 1: resample
  train          Train model If D_0.pth or G_0.pth not found, automatically download from hub.
  train-cluster  Train k-means clustering
  vc             Realtime inference from microphone

Contributors ✨

Thanks goes to these wonderful people (emoji key):

_34j
💻 🤔 📖

This project follows the all-contributors specification. Contributions of any kind welcome!

nishikinoyan/so-vits-svc-fork