erew123/alltalk_tts

Error in Step 2 Fine tuning.

mercuryyy opened this issue · 3 comments

I'm getting an error in Step 2 of fine-tuning.

I'm also posting the Step 1 output for debugging.

My training wav file is 2 minutes and 22 seconds long.

After Step 1 finishes, I only see one file in tmp-trn/wavs: "anelvoice_00000000.wav".
I thought it would split the file into many parts.

Anyway, Step 1 completes and then I get an error in Step 2.

Step 1:

[FINETUNE] Part of AllTalk https://github.com/erew123/alltalk_tts/
[FINETUNE] Coqui Public Model License
[FINETUNE] https://coqui.ai/cpml.txt
[FINETUNE] Whisper model: large-v2 Language: en Evaluation data percentage: 15.0%
[FINETUNE] Starting Step 1 - Preparing Audio/Generating the dataset
[FINETUNE] Updated lang.txt with the target language.
[FINETUNE] Loading Whisper Model: large-v2
[FINETUNE] Model will be downloaded if its not available, which will take a few minutes.
[FINETUNE] Current working file: /home/izzy/alltalk/alltalk2/alltalk_tts/finetune/put-voice-samples-in-here/anelvoice.wav
[FINETUNE] Processing audio with duration 02:22.571
[FINETUNE] VAD filter removed 00:05.072 of audio
[FINETUNE] Train CSV: /home/izzy/alltalk/alltalk2/alltalk_tts/finetune/tmp-trn/metadata_train.csv
[FINETUNE] Eval CSV: /home/izzy/alltalk/alltalk2/alltalk_tts/finetune/tmp-trn/metadata_eval.csv
[FINETUNE] Audio Total: 142.5705442176871
[FINETUNE] Dataset Generated. Move to Step 2

Running on local URL:  http://127.0.0.1:7052

To create a public link, set `share=True` in `launch()`.
IMPORTANT: You are using gradio version 3.50.2, however version 4.29.0 is available, please upgrade.
--------
[FINETUNE] Starting Step 2 - Fine-tuning the XTTS Encoder
[FINETUNE] Language: en Epochs: 10 Batch size: 4 Grad accumulation steps: 1
[FINETUNE] Training   : /home/izzy/alltalk/alltalk2/alltalk_tts/finetune/tmp-trn/metadata_train.csv
[FINETUNE] Evaluation : /home/izzy/alltalk/alltalk2/alltalk_tts/finetune/tmp-trn/metadata_eval.csv
[FINETUNE] Available VRAM: 23.53 GB
[FINETUNE] Starting finetuning on Base Model
>> DVAE weights restored from: /home/izzy/alltalk/alltalk2/alltalk_tts/models/xttsv2_2.0.2/dvae.pth
 | > Found 1 files in /home/izzy/alltalk/alltalk2/alltalk_tts/finetune/tmp-trn
 > Sampling by language: dict_keys(['en'])
Traceback (most recent call last):
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/trainer/trainer.py", line 1833, in fit
    self._fit()
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/trainer/trainer.py", line 1785, in _fit
    self.train_epoch()
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/trainer/trainer.py", line 1503, in train_epoch
    for cur_step, batch in enumerate(self.train_loader):
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/torch/_utils.py", line 705, in reraise
    raise exception
RecursionError: Caught RecursionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
            ~~~~~~~~~~~~^^^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/TTS/tts/layers/xtts/trainer/dataset.py", line 180, in __getitem__
    return self[1]
           ~~~~^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/TTS/tts/layers/xtts/trainer/dataset.py", line 156, in __getitem__
    return self[1]
           ~~~~^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/TTS/tts/layers/xtts/trainer/dataset.py", line 156, in __getitem__
    return self[1]
           ~~~~^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/TTS/tts/layers/xtts/trainer/dataset.py", line 156, in __getitem__
    return self[1]
           ~~~~^^^
  [Previous line repeated 967 more times]
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/TTS/tts/layers/xtts/trainer/dataset.py", line 146, in __getitem__
    index = random.randint(0, len(self.samples[lang]) - 1)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/random.py", line 362, in randint
    return self.randrange(a, b+1)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/random.py", line 344, in randrange
    return istart + self._randbelow(width)
                    ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/random.py", line 239, in _randbelow_with_getrandbits
    k = n.bit_length()  # don't use (n-1) here because n can be 1
        ^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded while calling a Python object

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/finetune.py", line 1523, in train_model
    config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(language, num_epochs, batch_size, grad_acumm, train_csv, eval_csv, learning_rate, output_path=str(output_path), max_audio_length=max_audio_length)
                                                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/finetune.py", line 691, in train_gpt
    trainer.fit()
  File "/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/trainer/trainer.py", line 1862, in fit
    sys.exit(1)
SystemExit: 1

Here is my diagnostics file:

GPU Information: Mon Jun 3 09:22:50 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.02              Driver Version: 555.42.02      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:01:00.0 Off |                  Off |
|  0%   41C    P8             48W /  480W |     240MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2410      G   /usr/lib/xorg/Xorg                            176MiB |
|    0   N/A  N/A      2535      G   /usr/bin/gnome-shell                           47MiB |
+-----------------------------------------------------------------------------------------+

Port Status: Port 7851 is available.

CUDA Working: Success - CUDA is available and working.
CUDA_HOME: /usr/local/cuda
Cublas64_11 Path: /home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages/nvidia/cublas/lib/libcublas.so.11

Torch Version: 2.3.0+cu121
Python Version: 3.11.9
Python Executable: /home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/bin/python
Conda Environment: /home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env

Python Search Path:
/home/izzy/alltalk/alltalk2/alltalk_tts
/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python311.zip
/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11
/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/lib-dynload
/home/izzy/alltalk/alltalk2/alltalk_tts/alltalk_environment/env/lib/python3.11/site-packages

Requirements file package comparison:
absl-py Required: == 2.1.0 Installed: 2.1.0
aiofiles Required: == 23.2.1 Installed: 23.2.1
aiohttp Required: == 3.9.3 Installed: 3.9.3
aiosignal Required: == 1.3.1 Installed: 1.3.1
altair Required: == 5.2.0 Installed: 5.2.0
annotated-types Required: == 0.6.0 Installed: 0.6.0
anyascii Required: == 0.3.2 Installed: 0.3.2
anyio Required: == 4.3.0 Installed: 4.3.0
attrs Required: == 23.2.0 Installed: 23.2.0
audioread Required: == 3.0.1 Installed: 3.0.1
av Required: == 11.0.0 Installed: 11.0.0
Babel Required: == 2.14.0 Installed: 2.14.0
bangla Required: == 0.0.2 Installed: 0.0.2
blinker Required: == 1.7.0 Installed: 1.7.0
blis Required: == 0.7.11 Installed: 0.7.11
bnnumerizer Required: == 0.0.2 Installed: 0.0.2
bnunicodenormalizer Required: == 0.1.6 Installed: 0.1.6
catalogue Required: == 2.0.10 Installed: 2.0.10
certifi Required: == 2024.2.2 Installed: 2024.2.2
cffi Required: == 1.16.0 Installed: 1.16.0
charset-normalizer Required: == 3.3.2 Installed: 3.3.2
click Required: == 8.1.7 Installed: 8.1.7
cloudpathlib Required: == 0.16.0 Installed: 0.16.0
colorama Required: == 0.4.6 Installed: 0.4.6
coloredlogs Required: == 15.0.1 Installed: 15.0.1
confection Required: == 0.1.4 Installed: 0.1.4
contourpy Required: == 1.2.0 Installed: 1.2.0
coqpit Required: == 0.0.17 Installed: 0.0.17
ctranslate2 Required: == 4.1.0 Installed: 4.1.0
cutlet Required: == 0.4.0 Installed: 0.4.0
cycler Required: == 0.12.1 Installed: 0.12.1
cymem Required: == 2.0.8 Installed: 2.0.8
Cython Required: == 3.0.9 Installed: 3.0.9
dateparser Required: == 1.1.8 Installed: 1.1.8
decorator Required: == 5.1.1 Installed: 5.1.1
docopt Required: == 0.6.2 Installed: 0.6.2
einops Required: == 0.7.0 Installed: 0.7.0
encodec Required: == 0.1.1 Installed: 0.1.1
fastapi Required: == 0.110.0 Installed: 0.110.0
faster-whisper Required: == 1.0.1 Installed: 1.0.1
ffmpy Required: == 0.3.2 Installed: 0.3.2
filelock Required: == 3.13.3 Installed: 3.13.3
Flask Required: == 3.0.2 Installed: 3.0.2
flatbuffers Required: == 24.3.25 Installed: 24.3.25
fonttools Required: == 4.50.0 Installed: 4.50.0
frozenlist Required: == 1.4.1 Installed: 1.4.1
fsspec Required: == 2024.3.1 Installed: 2024.3.1
fugashi Required: == 1.3.1 Installed: 1.3.1
fuzzywuzzy Required: >= 0.18.0 Installed: 0.18.0
g2pkk Required: == 0.1.2 Installed: 0.1.2
gradio Required: == 3.50.2 Installed: 3.50.2
gradio_client Required: == 0.6.1 Installed: 0.6.1
grpcio Required: == 1.62.1 Installed: 1.62.1
gruut Required: == 2.2.3 Installed: 2.2.3
gruut-ipa Required: == 0.13.0 Installed: 0.13.0
h11 Required: == 0.14.0 Installed: 0.14.0
hangul-romanize Required: == 0.1.0 Installed: 0.1.0
httpcore Required: == 1.0.4 Installed: 1.0.4
httpx Required: == 0.27.0 Installed: 0.27.0
huggingface-hub Required: == 0.22.1 Installed: 0.22.1
humanfriendly Required: == 10.0 Installed: 10.0
idna Required: == 3.6 Installed: 3.6
importlib_metadata Required: == 7.1.0 Installed: 7.1.0
importlib_resources Required: == 6.4.0 Installed: 6.4.0
inflect Required: == 7.0.0 Installed: 7.0.0
itsdangerous Required: == 2.1.2 Installed: 2.1.2
jaconv Required: == 0.3.4 Installed: 0.3.4
jamo Required: == 0.4.1 Installed: 0.4.1
jieba Required: == 0.42.1 Installed: 0.42.1
Jinja2 Required: == 3.1.3 Installed: 3.1.3
joblib Required: == 1.3.2 Installed: 1.3.2
jsonlines Required: == 1.2.0 Installed: 1.2.0
jsonschema Required: == 4.21.1 Installed: 4.21.1
jsonschema-specifications Required: == 2023.12.1 Installed: 2023.12.1
kiwisolver Required: == 1.4.5 Installed: 1.4.5
langcodes Required: == 3.3.0 Installed: 3.3.0
lazy_loader Required: == 0.3 Installed: 0.3
librosa Required: == 0.10.1 Installed: 0.10.1
llvmlite Required: == 0.42.0 Installed: 0.42.0
Markdown Required: == 3.6 Installed: 3.6
MarkupSafe Required: == 2.1.5 Installed: 2.1.5
matplotlib Required: == 3.8.3 Installed: 3.8.3
mojimoji Required: == 0.0.13 Installed: 0.0.13
mpmath Required: == 1.3.0 Installed: 1.3.0
msgpack Required: == 1.0.8 Installed: 1.0.8
multidict Required: == 6.0.5 Installed: 6.0.5
murmurhash Required: == 1.0.10 Installed: 1.0.10
networkx Required: == 2.8.8 Installed: 2.8.8
nltk Required: == 3.8.1 Installed: 3.8.1
num2words Required: == 0.5.13 Installed: 0.5.13
numba Required: == 0.59.1 Installed: 0.59.1
numpy Required: == 1.26.4 Installed: 1.26.4
nvidia-cublas-cu11 Required: >= 11.11.3.6 Installed: 11.11.3.6
nvidia-cudnn-cu11 Required: >= 9.0.0.312 Installed: 9.1.1.17
onnxruntime Required: == 1.17.1 Installed: 1.17.1
orjson Required: == 3.9.15 Installed: 3.9.15
packaging Required: == 24.0 Installed: 24.0
pandas Required: == 1.5.3 Installed: 1.5.3
pillow Required: == 10.2.0 Installed: 10.2.0
platformdirs Required: == 4.2.0 Installed: 4.2.0
pooch Required: == 1.8.1 Installed: 1.8.1
preshed Required: == 3.0.9 Installed: 3.0.9
protobuf Required: == 5.26.0 Installed: 5.26.0
psutil Required: == 5.9.8 Installed: 5.9.8
pycparser Required: == 2.21 Installed: 2.21
pydantic Required: == 2.6.4 Installed: 2.6.4
pydantic_core Required: == 2.16.3 Installed: 2.16.3
pydub Required: == 0.25.1 Installed: 0.25.1
pynndescent Required: == 0.5.11 Installed: 0.5.11
pyparsing Required: == 3.1.2 Installed: 3.1.2
pypinyin Required: == 0.51.0 Installed: 0.51.0
pyreadline3 Required: == 3.4.1 Installed: 3.4.1
pysbd Required: == 0.3.4 Installed: 0.3.4
python-crfsuite Required: == 0.9.10 Installed: 0.9.10
python-dateutil Required: == 2.9.0.post0 Installed: 2.9.0.post0
python-Levenshtein Required: >= 0.25.0 Installed: 0.25.1
python-multipart Required: == 0.0.9 Installed: 0.0.9
pytz Required: == 2024.1 Installed: 2024.1
PyYAML Required: == 6.0.1 Installed: 6.0.1
referencing Required: == 0.34.0 Installed: 0.34.0
regex Required: == 2023.12.25 Installed: 2023.12.25
requests Required: == 2.31.0 Installed: 2.31.0
rpds-py Required: == 0.18.0 Installed: 0.18.0
safetensors Required: == 0.4.2 Installed: 0.4.2
scikit-learn Required: == 1.4.1.post1 Installed: 1.4.1.post1
scipy Required: == 1.12.0 Installed: 1.12.0
semantic-version Required: == 2.10.0 Installed: 2.10.0
six Required: == 1.16.0 Installed: 1.16.0
smart-open Required: == 6.4.0 Installed: 6.4.0
sniffio Required: == 1.3.1 Installed: 1.3.1
sounddevice Required: == 0.4.6 Installed: 0.4.6
soundfile Required: == 0.12.1 Installed: 0.12.1
soxr Required: == 0.3.7 Installed: 0.3.7
spacy Required: == 3.7.4 Installed: 3.7.4
spacy-legacy Required: == 3.0.12 Installed: 3.0.12
spacy-loggers Required: == 1.0.5 Installed: 1.0.5
srsly Required: == 2.4.8 Installed: 2.4.8
starlette Required: == 0.36.3 Installed: 0.36.3
SudachiDict-core Required: == 20240109 Installed: 20240109
SudachiPy Required: == 0.6.8 Installed: 0.6.8
sympy Required: == 1.12 Installed: 1.12
tensorboard Required: == 2.16.2 Installed: 2.16.2
tensorboard-data-server Required: == 0.7.2 Installed: 0.7.2
thinc Required: == 8.2.3 Installed: 8.2.3
threadpoolctl Required: == 3.4.0 Installed: 3.4.0
tokenizers Required: == 0.15.2 Installed: 0.15.2
toolz Required: == 0.12.1 Installed: 0.12.1
torch Required: >= 2.2.0 Installed: 2.3.0+cu121
torchaudio Required: >= 2.2.0 Installed: 2.3.0+cu121
tqdm Required: == 4.66.2 Installed: 4.66.2
trainer Required: == 0.0.36 Installed: 0.0.36
transformers Required: == 4.39.1 Installed: 4.39.1
TTS Required: == 0.22.0 Installed: 0.22.0
typer Required: == 0.9.4 Installed: 0.9.4
typing_extensions Required: == 4.10.0 Installed: 4.10.0
tzdata Required: == 2024.1 Installed: 2024.1
tzlocal Required: == 5.2 Installed: 5.2
umap-learn Required: == 0.5.5 Installed: 0.5.5
Unidecode Required: == 1.3.8 Installed: 1.3.8
unidic-lite Required: == 1.0.8 Installed: 1.0.8
urllib3 Required: == 2.2.1 Installed: 2.2.1
uvicorn Required: == 0.29.0 Installed: 0.29.0
wasabi Required: == 1.1.2 Installed: 1.1.2
weasel Required: == 0.3.4 Installed: 0.3.4
websockets Required: == 11.0.3 Installed: 11.0.3
Werkzeug Required: == 3.0.1 Installed: 3.0.1
yarl Required: == 1.9.4 Installed: 1.9.4
zipp Required: == 3.18.1 Installed: 3.18.1

Hi @mercuryyy

I'd be pretty confident it couldn't split your wav file down properly. With the file being so small, I can't recall whether it will accept it if you split it into 2 files before running Step 1 (too many revisions of the code to recall exactly).
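
For what it's worth, the repeated `return self[1]` frames in the traceback look like a retry path with no way out. Below is a minimal sketch of that pattern. It is not the TTS library's code: the class name, the `max_len_s` threshold, and the idea that the clip gets rejected for being too long are all assumptions made for illustration.

```python
import random


class RetryingDataset:
    """Toy stand-in for a dataset that retries when a sample is rejected.

    Hypothetical names throughout; this only mimics the shape of the traceback.
    """

    def __init__(self, durations_s, max_len_s=12.0):
        self.durations_s = durations_s  # one entry per clip, in seconds
        self.max_len_s = max_len_s      # assumed rejection threshold

    def __getitem__(self, index):
        # Pick a random clip, similar to the randint call in the traceback.
        pick = random.randint(0, len(self.durations_s) - 1)
        duration = self.durations_s[pick]
        if duration > self.max_len_s:
            # Rejected clip: retry. If no clip is acceptable, this never ends.
            return self[1]
        return duration


def try_fetch(dataset):
    """Return the fetched value, or the string 'RecursionError' on overflow."""
    try:
        return dataset[0]
    except RecursionError:
        return "RecursionError"
```

With a single 142-second clip, as in `RetryingDataset([142.57])`, every retry lands on the same rejected clip and `try_fetch` reports the recursion overflow. Once the list holds clips under the threshold, the retry eventually lands on one and returns.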

However, in the next release (well, the BETA) I've added extra code to force files to be split in situations where it struggles to do this. Hopefully I will have that out within the next 24-48 hours.

For now, you could try splitting the audio into two parts with Audacity and see if that clears it.
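
If you'd rather script the split than use Audacity, here is a rough sketch using only Python's standard `wave` module. The function name `split_wav` and the `chunk_seconds` default are made up for this example, and it assumes an uncompressed PCM wav. It cuts at fixed time offsets, so it can split in the middle of a word, which Audacity lets you avoid by choosing a pause.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, chunk_seconds=10.0):
    """Cut `src` into consecutive chunks of at most `chunk_seconds` each.

    Returns the list of files written. Assumes uncompressed PCM audio.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = int(reader.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source file
            target = out_dir / f"{Path(src).stem}_{index:03d}.wav"
            with wave.open(str(target), "wb") as writer:
                # Copy channels, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                writer.setparams(params)
                writer.writeframes(frames)
            written.append(target)
            index += 1
    return written
```

A call such as `split_wav("anelvoice.wav", "parts", chunk_seconds=75)` would turn the 2:22 file into two pieces. The output folder name and the chunk length here are only examples, so adjust them before copying the results into put-voice-samples-in-here.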


Thanks

Thank you, I got it working by manually splitting the files.

Can't wait for the v2 Beta, hope you release it soon :)