Retrain an existing nnet3 model with more data
cogmeta opened this issue · 3 comments
Is there a way to retrain an existing nnet3 model with additional data and avoid training everything from scratch again? We have a Kaldi nnet3 model trained on 3k hours of data. We have obtained an additional 3k hours of data and would like to retrain the existing model rather than start everything from scratch.
Yes, Kaldi does support transfer learning (which is effectively what you're trying to do here, I assume). Please check the kaldi-help group for details.
@cogmeta you may want to look at this script for reference: https://github.com/kaldi-asr/kaldi/blob/master/egs/aishell2/s5/local/nnet3/tuning/finetune_tdnn_1a.sh . But keep in mind that fine-tuning is a trade-off between the "old" data and the "new" data: if you fine-tune only on the new data, the model can drift away from what it learned from the old data. Mix the new data with some proportion of your old data if necessary.
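To make the "mix in some proportion of old data" idea concrete, here is a minimal Python sketch. It is not taken from the linked script: the function name `mix_utterances`, the `old_fraction` parameter, and the utterance IDs are all hypothetical and used only for illustration. It keeps every new utterance and a random fraction of the old ones.

```python
import random


def mix_utterances(old_utts, new_utts, old_fraction, seed=0):
    """Return all new utterance IDs plus a random `old_fraction` of the old ones.

    Hypothetical helper for illustration; not part of Kaldi.
    """
    if not 0.0 <= old_fraction <= 1.0:
        raise ValueError("old_fraction must be in [0, 1]")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    n_old = round(len(old_utts) * old_fraction)
    # Sort before sampling so the result does not depend on input order.
    kept_old = rng.sample(sorted(old_utts), n_old)
    return sorted(set(kept_old) | set(new_utts))


# Made-up utterance IDs standing in for the old and new training sets.
old = [f"old_{i:04d}" for i in range(1000)]
new = [f"new_{i:04d}" for i in range(1000)]

# Keep 30% of the old utterances alongside all of the new ones.
mixed = mix_utterances(old, new, old_fraction=0.3)
print(len(mixed))
```

In an actual Kaldi setup the same selection would normally be done on data directories rather than ID lists, for example by subsetting the old data directory with `utils/subset_data_dir.sh` and merging it with the new one using `utils/combine_data.sh`. The right proportion is something to tune against a held-out set; the 30% above is only a placeholder.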
Actually, the amount of data being added is larger than the original data set, so I am guessing that training from scratch might be a good idea.