carpedm20/multi-speaker-tacotron-tensorflow

I get an error when training a single-speaker model ㅠㅠ

HumanKR opened this issue · 1 comment

D:\python project\multi-speaker-tacotron-tensorflow>python train.py --data_path=datasets/first
[*] MODEL dir: logs\first_2019-12-29_23-16-55
[*] PARAM path: logs\first_2019-12-29_23-16-55\params.json
['datasets/first']

[!] Detect non-krbook dataset. May need to set sampling rate from 22050 to 20000

[*] git recv-parse HEAD:
becbd0a

==================================================

[*] Checkpoint path: logs\first_2019-12-29_23-16-55\model.ckpt
[*] Loading training data from: ['datasets/first\data']
[*] Using model: logs\first_2019-12-29_23-16-55
Hyperparameters:
adam_beta1: 0.9
adam_beta2: 0.999
attention_size: 128
attention_state_size: 256
attention_type: bah_mon
batch_size: 32
cleaners: korean_cleaners
dec_layer_num: 2
dec_prenet_sizes: [256, 128]
dec_rnn_size: 256
decay_learning_rate_mode: 0
dropout_prob: 0.5
embedding_size: 256
enc_bank_channel_size: 128
enc_bank_size: 16
enc_highway_depth: 4
enc_maxpool_width: 2
enc_prenet_sizes: [256, 128]
enc_proj_sizes: [128, 128]
enc_proj_width: 3
enc_rnn_size: 128
frame_length_ms: 50
frame_shift_ms: 12.5
griffin_lim_iters: 60
ignore_recognition_level: 0
initial_data_greedy: True
initial_learning_rate: 0.001
initial_phase_step: 8000
main_data: ['']
main_data_greedy_factor: 0
max_iters: 200
min_iters: 30
min_level_db: -100
min_tokens: 10
model_type: single
num_freq: 1025
num_mels: 80
post_bank_channel_size: 128
post_bank_size: 8
post_highway_depth: 4
post_maxpool_width: 2
post_proj_sizes: [256, 80]
post_proj_width: 3
post_rnn_size: 128
power: 1.5
preemphasis: 0.97
prioritize_loss: False
recognition_loss_coeff: 0.2
reduction_factor: 5
ref_level_db: 20
sample_rate: 22050
skip_inadequate: False
speaker_embedding_size: 16
use_fixed_test_inputs: False
filter_by_min_max_frame_batch: 100%|███████████████████████████████████████████████████| 30/30 [00:11<00:00, 2.66it/s]
[datasets/first\data] Loaded metadata for 14 examples (0.01 hours)
[datasets/first\data] Max length: 394
[datasets/first\data] Min length: 151

{'datasets/first\data': 1.0}

filter_by_min_max_frame_batch: 100%|███████████████████████████████████████████████████| 30/30 [00:10<00:00, 2.81it/s]
[datasets/first\data] Loaded metadata for 14 examples (0.01 hours)
[datasets/first\data] Max length: 394
[datasets/first\data] Min length: 151

{'datasets/first\data': 1.0}

========================================
model_type: single

Initialized Tacotron model. Dimensions:
embedding: 256
speaker embedding: None
prenet out: 128
encoder out: 256
attention out: 256
concat attn & out: 512
decoder cell out: 256
decoder out (5 frames): 400
decoder out (1 frame): 80
postnet out: 256
linear out: 1025

model_type: single

Initialized Tacotron model. Dimensions:
embedding: 256
speaker embedding: None
prenet out: 128
encoder out: 256
attention out: 256
concat attn & out: 512
decoder cell out: 256
decoder out (5 frames): 400
decoder out (1 frame): 80
postnet out: 256
linear out: 1025
2019-12-29 23:17:37.815177: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2019-12-29 23:17:37.848527: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2019-12-29 23:17:38.259727: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:955] Found device 0 with properties:
name: GeForce GTX 970
major: 5 minor: 2 memoryClockRate (GHz) 1.329
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.31GiB
2019-12-29 23:17:38.273441: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:976] DMA: 0
2019-12-29 23:17:38.278149: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:986] 0: Y
2019-12-29 23:17:38.283045: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
Starting new training run at commit: None
Generated 8 batches of size 2 in 0.000 sec
Traceback (most recent call last):
  File "D:\python project\multi-speaker-tacotron-tensorflow\datasets\datafeeder.py", line 204, in run
    self._enqueue_next_group()
  File "D:\python project\multi-speaker-tacotron-tensorflow\datasets\datafeeder.py", line 229, in _enqueue_next_group
    for _ in range(int(n * self._batches_per_group // len(self.data_dirs)))]
  File "D:\python project\multi-speaker-tacotron-tensorflow\datasets\datafeeder.py", line 229, in <listcomp>
    for _ in range(int(n * self._batches_per_group // len(self.data_dirs)))]
  File "D:\python project\multi-speaker-tacotron-tensorflow\datasets\datafeeder.py", line 257, in _get_next_example
    data_path = data_paths[self._offset[data_dir]]
IndexError: list index out of range

After this it just stops, with no progress at all... I've looked around everywhere, and it still happens even after lowering min_tokens in hparams...
Is it because I have too little training data? QAQ
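
For reference: the log above loads only 14 examples against a batch_size of 32, so the feeder can run out of paths when it tries to fill a group of batches. A quick way to count what preprocessing actually produced (a minimal sketch; it assumes one .npz file per utterance under datasets/first/data, matching the paths in the log):

```python
# Count preprocessed examples and compare against batch_size.
# Assumption: preprocessing wrote one .npz per utterance into this folder.
import glob
import os

data_dir = os.path.join("datasets", "first", "data")
examples = glob.glob(os.path.join(data_dir, "*.npz"))
print("%d examples found in %s" % (len(examples), data_dir))

batch_size = 32  # value from the hparams dump above
if len(examples) < batch_size:
    print("Fewer examples than batch_size; lower batch_size or add more data.")
```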

It looks like the error happens where batches are fetched; try adjusting batch_size, or check that the data actually exists at that path!
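
If it is the per-directory offset running past the end of the path list (line 257 in the traceback), a wrap-around guard before the lookup would avoid the IndexError. A toy, self-contained sketch of that guard; the names FeederOffsetDemo and _path_dict are assumptions for illustration, not the repo's actual code:

```python
import random

class FeederOffsetDemo:
    """Toy model of the feeder's per-directory offset bookkeeping
    (the real DataFeeder keeps self._offset, per the traceback)."""

    def __init__(self, path_dict):
        self._path_dict = path_dict                  # data_dir -> list of example paths
        self._offset = {d: 0 for d in path_dict}     # per-directory read position

    def _get_next_example(self, data_dir):
        data_paths = self._path_dict[data_dir]
        # Wrap instead of indexing past the end: the unguarded lookup is what
        # raises "IndexError: list index out of range" above.
        if self._offset[data_dir] >= len(data_paths):
            self._offset[data_dir] = 0
            random.shuffle(data_paths)
        data_path = data_paths[self._offset[data_dir]]
        self._offset[data_dir] += 1
        return data_path

feeder = FeederOffsetDemo({"datasets/first/data": ["a.npz", "b.npz", "c.npz"]})
for _ in range(7):   # safely cycles past the end of the 3-item list
    print(feeder._get_next_example("datasets/first/data"))
```

With only 14 examples on disk, lowering batch_size (e.g. to 2 or 4) is still the first thing to try.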