allenai/bilm-tf

How to restart training of a model

ashishjain1988 opened this issue · 2 comments

I have a very large dataset, so I split it into 100 files. Now I want to train on them one at a time, in batches. I see that the train function has a restart option (restart_ckpt_file). Which file should be passed to it, and how is it used?

When you train a biLM, you normally have one directory with all of the training files and another with some validation files. You train on all of the training files for one or more epochs. Then you evaluate the perplexity on the test data and restart for another epoch. You always train on all of the data for at least one epoch.
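To make "one epoch over all the data" concrete, here is a rough sketch of how the number of training batches follows from the token count. It assumes, as far as I remember the training loop, that one batch consumes batch_size * unroll_steps * n_gpus tokens; the function name `batches_for_run` is made up for illustration and is not part of bilm-tf.

```python
def batches_for_run(n_train_tokens, batch_size, unroll_steps, n_gpus, n_epochs):
    """Estimate how many batches a run of n_epochs takes.

    Assumption: every batch consumes batch_size * unroll_steps * n_gpus tokens,
    and a partial batch at the end of an epoch is dropped.
    """
    tokens_per_batch = batch_size * unroll_steps * n_gpus
    batches_per_epoch = n_train_tokens // tokens_per_batch
    return batches_per_epoch * n_epochs


# Hypothetical numbers: 1M tokens across all 100 files, one GPU, one epoch.
total_batches = batches_for_run(
    n_train_tokens=1_000_000,
    batch_size=128,
    unroll_steps=20,
    n_gpus=1,
    n_epochs=1,
)
print(total_batches)
```

If this assumption holds, the token count should cover all 100 files together rather than a single file, otherwise an "epoch" ends long before the whole dataset has been seen.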

For evaluation, see: https://github.com/allenai/bilm-tf#3-evaluate-the-trained-model

For restarting, see: https://github.com/allenai/bilm-tf#how-to-do-fine-tune-a-model-on-additional-unlabeled-data
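On the question of which file to pass: my understanding is that restart_ckpt_file expects a TensorFlow checkpoint prefix (for example save_dir/model.ckpt-12345, without the .index or .data suffix), and that bin/restart.py looks up the latest one in the save directory for you. The sketch below only illustrates how such a prefix could be located by hand. It assumes checkpoints are named model.ckpt-<step>; the helper `latest_checkpoint_prefix` is hypothetical and not part of bilm-tf.

```python
import os
import re


def latest_checkpoint_prefix(save_dir):
    """Return the checkpoint prefix with the highest step in save_dir, or None.

    Assumption: checkpoints are written as model.ckpt-<step>.index plus
    matching .data-* files, which is the usual TensorFlow 1.x layout.
    """
    pattern = re.compile(r"^(model\.ckpt-(\d+))\.index$")
    best_step, best_prefix = -1, None
    for name in os.listdir(save_dir):
        match = pattern.match(name)
        if match and int(match.group(2)) > best_step:
            best_step = int(match.group(2))
            # Keep the prefix only, without the ".index" suffix.
            best_prefix = os.path.join(save_dir, match.group(1))
    return best_prefix
```

The returned prefix is what would be handed to train(..., restart_ckpt_file=prefix), assuming the same options that were used for the original run.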

Maybe this helps (sorry, it is in German): https://eniak.de/it/training_of_german_word_embedding_for_nlp

Thank you for the information!