w-okada/beatrice-trainer-colab

An error occurs when using Google Colab: NameError: name 'data_iter' is not defined

Opened this issue · 4 comments

An error occurs at the training stage:

2024-07-30 21:58:54.319067: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered

2024-07-30 21:58:54.319124: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-07-30 21:58:54.320265: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-07-30 21:58:54.327118: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-30 21:58:55.905210: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
device=cuda
config:
{'adam_betas': [0.8, 0.99],
'adam_eps': 1e-06,
'batch_size': 8,
'grad_balancer_ema_decay': 0.995,
'grad_weight_adv': 1.0,
'grad_weight_fm': 1.0,
'grad_weight_mel': 1.0,
'hidden_channels': 256,
'in_ir_wav_dir': 'assets/ir',
'in_noise_wav_dir': 'assets/noise',
'in_sample_rate': 16000,
'in_test_wav_dir': 'assets/test',
'learning_rate': 0.0001,
'min_learning_rate': 5e-06,
'n_steps': 20000,
'num_workers': 16,
'out_sample_rate': 24000,
'phone_extractor_file': 'assets/pretrained/003b_checkpoint_03000000.pt',
'pitch_estimator_file': 'assets/pretrained/008_1_checkpoint_00300000.pt',
'pretrained_file': 'assets/pretrained/040c_checkpoint_libritts_r_200_02300000.pt',
'san': False,
'segment_length': 100,
'use_amp': True,
'warmup_steps': 10000,
'wav_length': 96000}

n_speakers=1
0: DevkaTrap

len(training_filelist)=1
len(test_filelist)=8
/content/beatrice-trainer/assets/test/common_voice_ja_38833628_16k.wav, [0]
/content/beatrice-trainer/assets/test/common_voice_ja_38843402_16k.wav, [0]
/content/beatrice-trainer/assets/test/common_voice_ja_38852485_16k.wav, [0]
/content/beatrice-trainer/assets/test/common_voice_ja_38853932_16k.wav, [0]
/content/beatrice-trainer/assets/test/common_voice_ja_38864552_16k.wav, [0]
/content/beatrice-trainer/assets/test/common_voice_ja_38878413_16k.wav, [0]
/content/beatrice-trainer/assets/test/common_voice_ja_38898180_16k.wav, [0]
/content/beatrice-trainer/assets/test/common_voice_ja_38925334_16k.wav, [0]

Computing mean F0s of target speakers...
0: 218.2Hz,
Done.
Computing pitch shifts for test files...
100% 8/8 [00:23<00:00, 2.98s/it]
Done.
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")

/usr/local/lib/python3.10/dist-packages/torch/optim/lr_scheduler.py:143: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
warnings.warn("Detected call of lr_scheduler.step() before optimizer.step(). "
Downloading: "https://github.com/tarepan/SpeechMOS/zipball/v1.0.0" to /root/.cache/torch/hub/v1.0.0.zip
Downloading: "https://github.com/tarepan/SpeechMOS/releases/download/v1.0.0/utmos22_strong_step7459_v1.pt" to /root/.cache/torch/hub/checkpoints/utmos22_strong_step7459_v1.pt
100% 392M/392M [00:05<00:00, 71.3MB/s]
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
0% 0/20000 [00:00<?, ?it/s]/usr/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
self.pid = os.fork()
0% 0/20000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/content/beatrice-trainer/beatrice_trainer/main.py", line 2632, in <module>
batch = next(data_iter)
NameError: name 'data_iter' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/content/beatrice-trainer/beatrice_trainer/main.py", line 2635, in <module>
batch = next(data_iter)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1319, in _next_data
raise StopIteration
StopIteration


Apparently the error is somewhere in this part of the code:

for iteration in tqdm(range(initial_iteration, h.n_steps)):
    # === 1. Data preprocessing ===
    try:
        batch = next(data_iter)
    except:
        data_iter = iter(training_loader)
        batch = next(data_iter)
    (
        clean_wavs,
        noisy_wavs_16k,
        slice_starts,
        speaker_ids,
        formant_shift_semitone,
    ) = map(lambda x: x.to(device, non_blocking=True), batch)
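
A minimal sketch of why the two tracebacks appear together, using a plain Python list in place of the real `training_loader` (the `first_batch` and `describe_failure` helpers are hypothetical stand-ins, not code from the repository). On the first pass `data_iter` does not exist yet, so the `except` branch runs and builds the iterator. If the loader then yields no batches at all, the second `next()` raises `StopIteration`. That suggests the `NameError` is only a side effect and the real problem is an empty training loader, which would fit `len(training_filelist)=1`, although the log alone does not show why it is empty.

```python
def first_batch(training_loader):
    """Mimic the first pass of the fetch logic in the training loop."""
    try:
        # data_iter is not bound yet. Inside a function this raises
        # UnboundLocalError, a subclass of NameError; at module level
        # (as in main.py) it is a plain NameError.
        batch = next(data_iter)
    except NameError:
        data_iter = iter(training_loader)
        # Raises StopIteration if the loader yields no batches at all.
        batch = next(data_iter)
    return batch


def describe_failure(training_loader):
    """Return (error name, name of the error it was raised while handling)."""
    try:
        first_batch(training_loader)
    except StopIteration as err:
        return type(err).__name__, type(err.__context__).__name__
    return None


if __name__ == "__main__":
    print(describe_failure([]))            # empty loader: both errors chained
    print(describe_failure([["batch0"]]))  # one batch available: no error
```

With a non-empty loader the `except` branch still runs once on the first iteration, but it succeeds silently, so the `NameError` never reaches the console.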

Me too.

Experiencing the same issue.

Convert the audio files in your dataset to mono.
If you still get errors after making that change, adjust the length of your dataset.
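
A rough sketch of how the files could be checked and downmixed with only the Python standard library, assuming they are 16-bit PCM WAV. The function names `wav_info` and `to_mono_16bit` are made up for illustration, and this is not part of beatrice-trainer; any audio tool that exports mono WAV should work just as well.

```python
import array
import sys
import wave


def wav_info(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return wf.getnchannels(), rate, wf.getnframes() / rate


def to_mono_16bit(src_path, dst_path):
    """Downmix a 16-bit PCM WAV to mono by averaging the channels per frame."""
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = src.getframerate()
        samples = array.array("h", src.readframes(src.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    # Samples are interleaved (L0 R0 L1 R1 ...), n_channels items per frame.
    mono = array.array(
        "h",
        (sum(samples[i:i + n_channels]) // n_channels
         for i in range(0, len(samples), n_channels)),
    )
    if sys.byteorder == "big":
        mono.byteswap()  # back to little-endian before writing
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(mono.tobytes())
```

Calling `wav_info` on each training file first shows which ones are not mono and how long each clip is, which covers both suggestions above.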

Thanks, man, this really helped me out. Would you like to collaborate and build something amazing?