deepgram/kur

speech.yml prediction is empty

kvinwang opened this issue · 2 comments

I am following the guide and ran kur -v train speech.yml. After 3 epochs, the prediction is still empty. What's wrong?

Epoch 2/inf, loss=308.429: 100%|████████████████████| 2432/2432 [4:38:09<00:00,  5.43s/samples]
[INFO 2018-01-31 01:46:06,342 kur.model.executor:439] Training loss: 308.429
Validating, loss=297.191:  94%|███████████████████ | 256/271 [07:37<00:32,  2.16s/samples]
[INFO 2018-01-31 01:53:44,197 kur.model.executor:267] Validation loss: 297.191
Prediction: ""
Truth: "in spite of their hard couches the pony riders slept soundly even professor zepplin himself never waking the whole night through"
     Total wall-clock time: 06h 32m 38s
  Training wall-clock time: 06h 17m 05s
Validation wall-clock time: 00h 15m 33s
     Batch wall-clock time: 06h 16m 59s

Epoch 3/inf, loss=307.431: 100%|████████████████████| 2432/2432 [3:48:12<00:00,  7.07s/samples]
[INFO 2018-01-31 05:41:57,108 kur.model.executor:439] Training loss: 307.431
Validating, loss=284.447:  94%|███████████████████ | 256/271 [07:02<00:27,  1.81s/samples]
[INFO 2018-01-31 05:48:59,855 kur.model.executor:267] Validation loss: 284.447
Prediction: ""
Truth: "lige leaning over the brink was able to follow the boy's movements by the aid of the thin arc of light made by the torch in tad's hand"
     Total wall-clock time: 10h 27m 53s
  Training wall-clock time: 10h 05m 17s
Validation wall-clock time: 00h 22m 36s
     Batch wall-clock time: 10h 05m 08s

If you have a slow GPU like me and had to reduce the batch size from 16 to 8 or even 4, keep in mind that training will take longer. I had the same problem, but it was not a Kur problem; it was a matter of patience. I left it training all day, and about 5 hours later, at roughly 8-10 epochs, I started to see some consonants and spaces in the output. Not words, but something was there! If you let it train longer, it will start producing words and the predictions will begin to resemble the original audio.
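If you want to double-check which batch size your run is actually using, here is a minimal Python sketch. It assumes the example speech.yml keeps the setting under train → provider → batch_size, which may differ in your version of the Kurfile, so adjust the key path as needed:

```python
# Minimal sketch: print the training batch size from the Kurfile.
# Assumes speech.yml keeps it under train -> provider -> batch_size;
# adjust the key path if your spec is organized differently.
import yaml

with open('speech.yml') as f:
    spec = yaml.safe_load(f)

provider = (spec.get('train') or {}).get('provider') or {}
print('train batch_size:', provider.get('batch_size', '(not set; Kur default applies)'))
```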

After taking a deeper look at the number of samples per epoch, I see you are training on the default dataset, which is very small, so the predictions will never make much sense with it anyway. I also notice your computer takes more than 4 hours to complete a single epoch, which is far too slow; I suspect you are training on your CPU. You need tensorflow-gpu with an appropriate GPU, otherwise you will need weeks or even months of training to obtain intelligible predictions on a bigger dataset.
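To confirm whether TensorFlow can actually see a GPU, you can run a quick check like the one below (this uses TensorFlow's device_lib API from the TF 1.x line in use at the time; nothing here is Kur-specific):

```python
# Quick check: list the devices TensorFlow can use.
# If only CPU devices show up, training will be orders of magnitude slower.
from tensorflow.python.client import device_lib

devices = device_lib.list_local_devices()
for d in devices:
    print(d.device_type, d.name)

if not any(d.device_type == 'GPU' for d in devices):
    print('No GPU visible to TensorFlow; install tensorflow-gpu and the CUDA drivers.')
```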