CharacterWordModels

We try to use the student-teacher paradigm to train a character-based 'student' language model from a word-level 'teacher' language model. Since character-based and word-level models learn slightly different 'information', training with a word-level model (along with the original data) may help the character model learn some characteristics of the word-level model, while hopefully retaining the information a character model learns on its own.
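
The text above does not say how the teacher signal is combined with the original data, so the following is only a minimal sketch of one possible setup, not this project's actual method. It assumes the teacher's next-word distribution is marginalised into a next-character distribution for the current word prefix, and that the student's loss interpolates between the cross-entropy on the true character and the cross-entropy against the teacher's distribution. The names `teacher_char_dist`, `distillation_loss`, `END`, and the weight `alpha` are all hypothetical.

```python
import math

END = "</w>"  # hypothetical end-of-word marker, not taken from the project


def teacher_char_dist(word_probs, prefix):
    """Turn a next-word distribution into a next-character distribution.

    word_probs: dict mapping word -> probability from an assumed word-level teacher.
    prefix: the characters of the current word seen so far.
    Returns a dict mapping next character -> probability (empty if no word matches).
    """
    char_mass = {}
    for word, p in word_probs.items():
        if not word.startswith(prefix):
            continue  # this word is inconsistent with what has been typed
        nxt = word[len(prefix)] if len(word) > len(prefix) else END
        char_mass[nxt] = char_mass.get(nxt, 0.0) + p
    total = sum(char_mass.values())
    if total == 0.0:
        return {}
    # Renormalise over the words that are still possible given the prefix.
    return {c: m / total for c, m in char_mass.items()}


def distillation_loss(student_probs, teacher_probs, gold_char, alpha=0.5):
    """Interpolate hard-label and teacher (soft-label) cross-entropy.

    alpha = 1.0 uses only the original data; alpha = 0.0 uses only the teacher.
    If teacher_probs is empty, the soft term contributes nothing.
    """
    eps = 1e-12  # avoids log(0) when the student assigns zero probability
    hard = -math.log(student_probs.get(gold_char, 0.0) + eps)
    soft = -sum(q * math.log(student_probs.get(c, 0.0) + eps)
                for c, q in teacher_probs.items())
    return alpha * hard + (1.0 - alpha) * soft


# Toy usage: the teacher thinks the next word is "cat", "car", or "dog".
teacher_words = {"cat": 0.5, "car": 0.3, "dog": 0.2}
soft_targets = teacher_char_dist(teacher_words, "ca")
student = {"t": 0.5, "r": 0.5}
loss = distillation_loss(student, soft_targets, gold_char="t", alpha=0.5)
```

In this toy case only "cat" and "car" are consistent with the prefix "ca", so the teacher's mass is split between the characters "t" and "r". In a real setup the student probabilities would come from the character model's softmax output, and `alpha` would be tuned on held-out data.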