TensorSpeech/TensorFlowTTS

Loss becomes nan at 19k steps

Hsn37 opened this issue · 1 comments

Hsn37 commented

I am training Tacotron 2 from scratch on IPA symbols. The input is entirely IPA, and the output is Urdu WAV files. I followed your instructions and preprocessed my dataset the same way as the LJSpeech one, using the same format.

Setup:
CUDA: 11.2
cuDNN: 8.1
GPU: RTX 3070
Driver: 510.54

Training started off smoothly, and the loss decreased steadily. The predictions and alignments also seemed to be improving. However, after 19,200 steps, the model loss suddenly became NaN. The logs did not show any extra information, so I could not tell what the reason was.
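For reference, this is roughly the kind of check that would at least pin down the first step where the loss stops being finite. It is only a minimal sketch in plain Python: `first_non_finite_step` and the example loss values are hypothetical and made up for illustration, and nothing here is part of the TensorFlowTTS API.

```python
import math


def first_non_finite_step(losses, start_step=1):
    """Return the step number of the first NaN/inf loss, or None if all are finite.

    `losses` is any iterable of per-step loss values (hypothetical input,
    e.g. values collected from a training log). `start_step` is the step
    number that corresponds to the first value.
    """
    for offset, value in enumerate(losses):
        if not math.isfinite(value):
            return start_step + offset
    return None


# Made-up loss values, for illustration only.
example_losses = [0.91, 0.85, float("nan"), float("nan")]
print(first_non_finite_step(example_losses, start_step=19198))
```

Knowing the exact step would make it possible to look at the batch that was being processed when the loss first went bad.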

At 19k steps:
[Images: 5_alignment, 15_alignment, b'c213', b'c412']

After this, the graphs are empty; there is nothing to be seen.
[Images: b'c412', 11_alignment]

Could you please help me figure out where the problem might be, and whether this is an issue on my end or in the library?

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.