andabi/deep-voice-conversion

librosa broadcast error while running train1

wishvivek opened this issue · 3 comments

@andabi @guang Hello, I'm using the pre-trained model for train1.py uploaded here, but I'm facing an issue while running train1.py. It initially throws an error inside librosa while computing the mel_basis (mel filterbank), then goes into training, but nothing happens after epoch 1 starts. Please help; I've been trying to get this running for the last 5 days. Any help is highly appreciated. Thanks!

Software versions:
Python 2.7
TensorFlow 1.12
Cuda 9.0
CuDNN 7.1.2
librosa 0.5.1
(all others same as in requirements.txt)

Running python train1.py 1 gives me three parts:

Part 1
Traceback (most recent call last):
File "/home/vivek/anaconda3/envs/voice/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
self.run()
File "/home/vivek/anaconda3/envs/voice/lib/python2.7/site-packages/tensorpack/dataflow/parallel.py", line 163, in run
for dp in self.ds:
File "/home/vivek/anaconda3/envs/voice/lib/python2.7/site-packages/tensorpack/dataflow/common.py", line 116, in iter
for data in self.ds:
File "/ssd/home/vivek/voice/deep-voice-conversion/data_load.py", line 37, in get_data
yield get_mfccs_and_phones(wav_file=wav_file)
File "/ssd/home/vivek/voice/deep-voice-conversion/data_load.py", line 78, in get_mfccs_and_phones
hp.default.hop_length)
File "/ssd/home/vivek/voice/deep-voice-conversion/data_load.py", line 149, in _get_mfcc_and_spec
mel_basis = librosa.filters.mel(hp.default.sr, hp.default.n_fft, hp.default.n_mels) # (n_mels, 1+n_fft//2)
File "/home/vivek/anaconda3/envs/voice/lib/python2.7/site-packages/librosa/filters.py", line 247, in mel
lower = -ramps[i] / fdiff[i]
ValueError: operands could not be broadcast together with shapes (1,257) (0,)
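
For anyone hitting the same ValueError: a quick way to check whether the installed librosa can build a mel filterbank at all is to call librosa.filters.mel standalone, the same way data_load.py does. The sr/n_fft/n_mels values below are assumptions for illustration (257 = 1 + 512//2 matches the shape in the traceback); check the repo's hparams for the real ones.

```python
# Minimal sanity check of librosa.filters.mel, outside of train1.py.
# sr, n_fft and n_mels are assumed values chosen to match the (1, 257) shape
# in the traceback above -- substitute the actual hp.default.* settings.
import numpy as np
import librosa

sr, n_fft, n_mels = 16000, 512, 40

mel_basis = librosa.filters.mel(sr, n_fft, n_mels)   # expected shape: (n_mels, 1 + n_fft//2)
print(mel_basis.shape)                               # e.g. (40, 257)
assert mel_basis.shape == (n_mels, 1 + n_fft // 2)

if not np.all(mel_basis.sum(axis=1) > 0):
    print("Warning: some mel filters are empty -- the installed librosa/joblib combo looks broken.")
```

If this standalone call already raises the broadcast ValueError, the problem is in the librosa installation rather than in the repo code.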

Part 2
[0122 19:31:42 @training.py:322] 'sync_variables_from_main_tower' includes 0 operations.
[0122 19:31:42 @model_utils.py:64] Trainable Variables:
name shape dim

net1/prenet/dense1/kernel:0 [40, 128] 5120
net1/prenet/dense1/bias:0 [128] 128
net1/prenet/dense2/kernel:0 [128, 64] 8192
net1/prenet/dense2/bias:0 [64] 64
net1/cbhg/conv1d_banks/num_1/conv1d/conv1d/kernel:0 [1, 64, 64] 4096
net1/cbhg/conv1d_banks/num_1/normalize/beta:0 [64] 64
net1/cbhg/conv1d_banks/num_1/normalize/gamma:0 [64] 64
net1/cbhg/conv1d_banks/num_2/conv1d/conv1d/kernel:0 [2, 64, 64] 8192
net1/cbhg/conv1d_banks/num_2/normalize/beta:0 [64] 64
net1/cbhg/conv1d_banks/num_2/normalize/gamma:0 [64] 64
net1/cbhg/conv1d_banks/num_3/conv1d/conv1d/kernel:0 [3, 64, 64] 12288
net1/cbhg/conv1d_banks/num_3/normalize/beta:0 [64] 64
net1/cbhg/conv1d_banks/num_3/normalize/gamma:0 [64] 64
net1/cbhg/conv1d_banks/num_4/conv1d/conv1d/kernel:0 [4, 64, 64] 16384
net1/cbhg/conv1d_banks/num_4/normalize/beta:0 [64] 64
net1/cbhg/conv1d_banks/num_4/normalize/gamma:0 [64] 64
net1/cbhg/conv1d_banks/num_5/conv1d/conv1d/kernel:0 [5, 64, 64] 20480
net1/cbhg/conv1d_banks/num_5/normalize/beta:0 [64] 64
net1/cbhg/conv1d_banks/num_5/normalize/gamma:0 [64] 64
net1/cbhg/conv1d_banks/num_6/conv1d/conv1d/kernel:0 [6, 64, 64] 24576
net1/cbhg/conv1d_banks/num_6/normalize/beta:0 [64] 64
net1/cbhg/conv1d_banks/num_6/normalize/gamma:0 [64] 64
net1/cbhg/conv1d_banks/num_7/conv1d/conv1d/kernel:0 [7, 64, 64] 28672
net1/cbhg/conv1d_banks/num_7/normalize/beta:0 [64] 64
net1/cbhg/conv1d_banks/num_7/normalize/gamma:0 [64] 64
net1/cbhg/conv1d_banks/num_8/conv1d/conv1d/kernel:0 [8, 64, 64] 32768
net1/cbhg/conv1d_banks/num_8/normalize/beta:0 [64] 64
net1/cbhg/conv1d_banks/num_8/normalize/gamma:0 [64] 64
net1/cbhg/conv1d_1/conv1d/kernel:0 [3, 512, 64] 98304
net1/cbhg/normalize/beta:0 [64] 64
net1/cbhg/normalize/gamma:0 [64] 64
net1/cbhg/conv1d_2/conv1d/kernel:0 [3, 64, 64] 12288
net1/cbhg/highwaynet_0/dense1/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_0/dense1/bias:0 [64] 64
net1/cbhg/highwaynet_0/dense2/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_0/dense2/bias:0 [64] 64
net1/cbhg/highwaynet_1/dense1/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_1/dense1/bias:0 [64] 64
net1/cbhg/highwaynet_1/dense2/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_1/dense2/bias:0 [64] 64
net1/cbhg/highwaynet_2/dense1/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_2/dense1/bias:0 [64] 64
net1/cbhg/highwaynet_2/dense2/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_2/dense2/bias:0 [64] 64
net1/cbhg/highwaynet_3/dense1/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_3/dense1/bias:0 [64] 64
net1/cbhg/highwaynet_3/dense2/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_3/dense2/bias:0 [64] 64
net1/cbhg/gru/bidirectional_rnn/fw/gru_cell/gates/kernel:0 [128, 128] 16384
net1/cbhg/gru/bidirectional_rnn/fw/gru_cell/gates/bias:0 [128] 128
net1/cbhg/gru/bidirectional_rnn/fw/gru_cell/candidate/kernel:0 [128, 64] 8192
net1/cbhg/gru/bidirectional_rnn/fw/gru_cell/candidate/bias:0 [64] 64
net1/cbhg/gru/bidirectional_rnn/bw/gru_cell/gates/kernel:0 [128, 128] 16384
net1/cbhg/gru/bidirectional_rnn/bw/gru_cell/gates/bias:0 [128] 128
net1/cbhg/gru/bidirectional_rnn/bw/gru_cell/candidate/kernel:0 [128, 64] 8192
net1/cbhg/gru/bidirectional_rnn/bw/gru_cell/candidate/bias:0 [64] 64
net1/dense/kernel:0 [128, 61] 7808
net1/dense/bias:0 [61] 61
Total #vars=58, #params=363389, size=1.39MB

Part 3
The following variables are in the graph, but not found in the checkpoint: net1/prenet/dense1/kernel:0, and so on
The following variables are in the checkpoint, but not found in the graph: beta1_power:0, beta2_power:0, net/net1/cbhg/conv1d_1/conv1d/kernel:0, and so on

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

tensor_name = global_step; expected dtype int64 does not equal original dtype int32
[[node 140367333287648/RestoreV2 (defined at /home/vivek/anaconda3/envs/voice/lib/python2.7/site-packages/tensorpack/tfutils/sessinit.py:114) = RestoreV2[dtypes=[DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_140367333287648/Const_0_0, 140367333287648/RestoreV2/tensor_names, 140367333287648/RestoreV2/shape_and_slices)]]
[[{{node 140367333287648/RestoreV2/_1}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_5_140367333287648/RestoreV2", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
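
A side note on Part 3: you can inspect what is actually stored in the pre-trained checkpoint and compare names and dtypes against the graph. This is a generic TF 1.x sketch using tf.train.NewCheckpointReader; the 'logdir/train1' path is just a placeholder for wherever your checkpoint lives.

```python
# Sketch: list variable names, shapes and dtypes stored in a TF 1.x checkpoint,
# to compare against the "in the graph / in the checkpoint" lists printed above.
# 'logdir/train1' is a placeholder path -- point it at your actual logdir.
import tensorflow as tf

checkpoint_path = tf.train.latest_checkpoint('logdir/train1')
reader = tf.train.NewCheckpointReader(checkpoint_path)

shapes = reader.get_variable_to_shape_map()
dtypes = reader.get_variable_to_dtype_map()
for name in sorted(shapes):
    print(name, shapes[name], dtypes[name])

# The restore error above complains about global_step's dtype, so check it directly:
print(reader.get_tensor('global_step').dtype)   # int32 in the checkpoint vs int64 expected by the graph
```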

Fixed this issue. librosa 0.5.1 had a problem in the function that computes the mel filterbanks. Installing the default pip version of librosa, together with the default joblib version (0.13.1), fixed the issue.

Brother @wishvivek, I am facing the same issue. How did you fix it?
Please help, it's urgent.

@YashBangera7 Just run 'pip install librosa' (hopefully it still installs 0.6.3) and 'pip install joblib' (0.13.2, I think). I ran these in my Anaconda environment and re-ran train1.py the usual way, and it worked without issues. Hope this helps!
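
If someone wants to double-check their environment before re-running, here is a minimal sketch that just prints the installed librosa and joblib versions and warns on the 0.5.x librosa that caused the broadcast error in this thread:

```python
# Minimal environment check: compare installed versions against the combination
# reported to work in this thread (librosa 0.6.3 + joblib 0.13.2).
import librosa
import joblib

print("librosa:", librosa.__version__)
print("joblib:", joblib.__version__)

if librosa.__version__.startswith("0.5."):
    print("Warning: librosa 0.5.x triggered the mel filterbank broadcast error here; "
          "try 'pip install -U librosa joblib'.")
```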