retrain existing nnet3 model with more data

Question

cogmeta opened this issue 5 years ago · 3 comments

Is there a way to retrain an existing nnet3 model with additional data and avoid training everything from scratch again? We have a Kaldi nnet3 model trained on 3k hours of data. We have since obtained an additional 3k of data and would like to retrain the existing model rather than start everything from scratch.

Answer 1 · 2020-05-16T13:38:05.000Z

Yes, kaldi does support transfer learning (which is effectively what you're trying to do here, I assume) - please check the kaldi-help group for details.

Answer 2 · 2020-05-16T13:46:00.000Z

@cogmeta you may want to look at this script for reference: https://github.com/kaldi-asr/kaldi/blob/master/egs/aishell2/s5/local/nnet3/tuning/finetune_tdnn_1a.sh . Keep in mind, though, that fine-tuning is a trade-off between what's "old" and what's "new": mix the new data with some proportion of your old data if necessary.
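To make the "mix new data with some proportion of old data" idea concrete, here is a minimal Python sketch of selecting utterances for a fine-tuning set. It is only an illustration: the function name `mix_utterances`, the `old_fraction` parameter, and the default of 0.3 are all hypothetical and not part of Kaldi, and the right proportion depends on your data. In an actual Kaldi recipe the same idea would normally be applied to data directories (for example with `utils/subset_data_dir.sh` and `utils/combine_data.sh`) rather than to plain lists of IDs.

```python
import random


def mix_utterances(old_utts, new_utts, old_fraction=0.3, seed=0):
    """Return all new utterance IDs plus a random subset of the old ones.

    old_fraction is the share of the OLD utterances to keep (hypothetical
    knob for this sketch; tune it for your own setup).
    """
    if not 0.0 <= old_fraction <= 1.0:
        raise ValueError("old_fraction must be in [0, 1]")

    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    n_old = round(len(old_utts) * old_fraction)
    kept_old = rng.sample(list(old_utts), n_old)

    # Kaldi data files are expected to be sorted by utterance ID,
    # so return the combined list in sorted order.
    return sorted(kept_old + list(new_utts))


if __name__ == "__main__":
    old = [f"old_{i:03d}" for i in range(10)]
    new = [f"new_{i:03d}" for i in range(5)]
    for utt in mix_utterances(old, new, old_fraction=0.3):
        print(utt)
```

A larger `old_fraction` keeps the model closer to its original behaviour, while a smaller one lets the new data dominate, which is the trade-off described above.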

Answer 3 · 2020-05-21T16:02:35.000Z

Actually, the amount of additional data will be larger than the original data, so I am guessing that training from scratch might be a good idea.