DiffSinger colab notebook that uses wav and lab as input (htk lab) for ease of use

DiffSinger_colab_notebook_MLo7

MLo7's DiffSinger training Colab notebook, an edited copy of Kei's DiffSinger Colab notebook.

Currently supported data formats:

  • lab + wav (NNSVS format)
  • csv + wav (DiffSinger format)
  • ds + wav (DiffSinger format), currently broken

Access the notebook here: Open In Colab


GUI note:

Python 3.10 was used and is recommended.

Run pip install PyYAML tk tqdm requests if you don't have these modules (they are necessary for the GUI).
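
A quick way to see which of these modules are already available is sketched below. This is not part of the repository: `missing_modules` is a hypothetical helper, and the import names are assumptions (PyYAML is imported as `yaml`, and the sketch assumes the GUI relies on the standard-library `tkinter`).

```python
import importlib.util

# Import names assumed to correspond to the pip packages listed above.
GUI_MODULES = ["yaml", "tkinter", "tqdm", "requests"]


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(GUI_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All GUI modules are available.")
```

If anything is reported missing, the pip command above should cover it, except `tkinter`, which on some systems ships with the Python installer rather than through pip.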


IMPORTANT NOTE:

  • The name of your_speaker_folder will be used as spk_name, so choose your folder names carefully
  • The Colab notebook primarily uses Python, so spaces in file names or folder paths may cause errors
  • For an in-depth guide to SVS training and/or labeling, please see SVS Singing Voice Database - Tutorial
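
To catch problematic names before zipping, something like the following could be run locally. It is a minimal sketch, not part of the notebook, and `names_with_spaces` is a hypothetical helper name.

```python
from pathlib import Path


def names_with_spaces(root):
    """Return relative paths under `root` whose file or folder name contains whitespace."""
    root = Path(root)
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*")
        if any(ch.isspace() for ch in path.name)
    )
```

For example, `names_with_spaces("your_speaker_folder")` returns an empty list when every name is safe; anything it does return is worth renaming (for instance, replacing spaces with underscores) before upload.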

This notebook converts your data (lab + wav) to a compatible format via nnsvs-db-converter.
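
For reference, an HTK-style .lab file lists one segment per line as `start end phoneme`, with times in units of 100 ns. The sketch below reads such text so you can sanity-check your labels. It is independent of nnsvs-db-converter (which performs the actual conversion); `parse_htk_lab` is a hypothetical helper, and it assumes exactly three whitespace-separated fields per line.

```python
HTK_UNITS_PER_SECOND = 10_000_000  # HTK label times are in units of 100 ns


def parse_htk_lab(text):
    """Parse HTK label text into (start_sec, end_sec, phoneme) tuples."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 3:
            raise ValueError(f"expected 'start end phoneme', got: {line!r}")
        start, end, phoneme = parts
        segments.append(
            (int(start) / HTK_UNITS_PER_SECOND,
             int(end) / HTK_UNITS_PER_SECOND,
             phoneme)
        )
    return segments


# Made-up example labels, only to show the shape of the data.
example = "0 2500000 pau\n2500000 4000000 k\n4000000 9000000 a\n"
for start, end, phoneme in parse_htk_lab(example):
    print(f"{start:.2f}-{end:.2f} s  {phoneme}")
```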

It is advised to refine your data with SlurCutter to get cleaner data for your pitch model.

Zip file format example:

#single speaker (lab + wav | ds + wav)
your_zip.zip:
    |
    |
    your_speaker_folder:
        |
        |
        data_1.wav
        data_1.lab (or .ds)
        .
        data_2.wav
        data_2.lab (or .ds)
        .
        data_3.wav
        data_3.lab (or .ds)
        .
        ...
#single speaker (csv + wav)
your_zip.zip:
    |
    |
    your_speaker_folder:
        |
        |
        wavs (folder named "wavs" containing all the wavs)
        .
        transcriptions.csv
#multi speaker (lab + wav | ds + wav)
your_zip.zip:
    |
    |
    your_speaker_folder_1:
        |
        |
        data_1.wav
        data_1.lab (or .ds)
        .
        data_2.wav
        data_2.lab (or .ds)
        .
        data_3.wav
        data_3.lab (or .ds)
        .
        ...
    your_speaker_folder_2:
        |
        |
        data_1.wav
        data_1.lab (or .ds)
        .
        data_2.wav
        data_2.lab (or .ds)
        .
        data_3.wav
        data_3.lab (or .ds)
        .
        ...
#multi speaker (csv + wav)
your_zip.zip:
    |
    |
    your_speaker_folder_1:
        |
        |
        wavs (folder named "wavs" containing all the wavs)
        .
        transcriptions.csv
    your_speaker_folder_2:
        |
        |
        wavs (folder named "wavs" containing all the wavs)
        .
        transcriptions.csv
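
The layouts above can be checked before uploading with a sketch like the one below. It is an illustration, not the notebook's own logic: `classify_speakers` is a hypothetical helper that treats each top-level folder in the zip as a speaker and guesses its format from the file names alone.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def classify_speakers(zip_path):
    """Map each top-level speaker folder in the zip to a guessed data format."""
    files = defaultdict(list)
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            parts = PurePosixPath(name).parts
            if name.endswith("/") or len(parts) < 2:
                continue  # skip directory entries and files outside a speaker folder
            files[parts[0]].append(parts[1:])

    result = {}
    for speaker, entries in files.items():
        # Files sitting directly inside the speaker folder.
        top_level = {entry[0] for entry in entries if len(entry) == 1}
        suffixes = {PurePosixPath(name).suffix.lower() for name in top_level}
        # Files sitting directly inside a "wavs" subfolder.
        has_wavs_dir = any(len(entry) == 2 and entry[0] == "wavs" for entry in entries)
        if "transcriptions.csv" in top_level and has_wavs_dir:
            result[speaker] = "csv + wav"
        elif ".wav" in suffixes and ".lab" in suffixes:
            result[speaker] = "lab + wav"
        elif ".wav" in suffixes and ".ds" in suffixes:
            result[speaker] = "ds + wav"
        else:
            result[speaker] = "unknown"
    return result
```

Calling `classify_speakers("your_zip.zip")` returns a dictionary keyed by speaker folder name; an "unknown" entry suggests the folder does not match any of the layouts shown above.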


Plans (updates might not happen in this order):

  • [script] add onnx exporter to ds_gui.py
  • [jupyter] overhaul the OpenUtau (ou) build section
  • [jupyter] make NSF-HiFiGAN vocoder training notebook

Credits:

  • openvpi for DiffSinger fork and more

  • UtaUtaUtau for nnsvs-db-converter

  • Kei for the original notebook

  • MLo7 for the notebook edit

  • PixPrucer for an in-depth SVS guide

  • haru0l for the base pretrain with embeds


Extra Note:

Wow you made it to the very bottom.... Why though lmao hahahahhshahhasdksajidhasjl

Feel free to make suggestions or ask questions via Discord. My display name is MLo7 and my username is ghin_mlo7.