hebrew_ULMFiT
Universal Language Model Fine-tuning for Text Classification in Hebrew, plus a bonus
I am happy to share the weights of the Hebrew ULMFiT model.
ULMFiT was published by Jeremy Howard and Sebastian Ruder here, and is taught in the Fast.ai course.
This model is very strong because it can easily be transferred to any kind of classification task you want.
The Hebrew Wikipedia text was downloaded from Professor Yoav Goldberg's website.
Download the Hebrew models: here
Wiki training
- hebrew_wiki_part_1.ipynb
Training on the wiki from scratch.
- hebrew_wiki_part_2.ipynb
Remove unwanted characters and retrain.
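The character clean-up step in hebrew_wiki_part_2.ipynb could look roughly like the sketch below. This is only an illustration: the function name `clean_hebrew_text` and the set of characters it keeps (Hebrew letters, digits, whitespace, and a few punctuation marks) are assumptions, not the notebook's actual filter.

```python
import re
import unicodedata

# Assumption: keep Hebrew letters (U+05D0-U+05EA), ASCII digits, whitespace,
# and a few punctuation marks. The notebook's real filter may differ.
_DISALLOWED = re.compile(r"""[^\u05D0-\u05EA0-9\s.,!?"'():-]""")
_WHITESPACE = re.compile(r"\s+")


def clean_hebrew_text(text: str) -> str:
    """Strip vowel points, replace disallowed characters, collapse spaces."""
    # Drop combining marks (Unicode category Mn, which covers niqqud) without
    # inserting a space, so that a pointed word stays in one piece.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    # Replace every other disallowed character with a space.
    text = _DISALLOWED.sub(" ", text)
    # Collapse runs of whitespace left behind by the replacements.
    return _WHITESPACE.sub(" ", text).strip()


cleaned = clean_hebrew_text("\u05E9\u05DC\u05D5\u05DD  <b>world</b>! 123")
```

Replacing disallowed characters with a space rather than deleting them keeps neighbouring Hebrew words from being glued together when markup or Latin text sits between them.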
Bonus - Amit Segal models.
- amit_segal_data.ipynb
Collect the Amit Segal data.
- amit_segal_language_model.ipynb
Train a language model on this corpus.
- amit_classification.ipynb
Transfer the model to classify sentences as correct or wrong.
The model achieves 0.68 accuracy, which is impressive given the data size and the possibility that a wrong sentence looks real (because the prediction is good), and the other way around.
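To put an accuracy figure like the one above in context, it helps to compare it with the score of always predicting the most frequent class. The sketch below shows that comparison on made-up toy labels; the helper names `accuracy` and `majority_baseline` are hypothetical, and the class balance of the real data is not stated here.

```python
from collections import Counter
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("labels must not be empty")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def majority_baseline(labels: Sequence[int]) -> float:
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Toy values only: 1 = correct sentence, 0 = wrong sentence.
toy_labels = [1, 0, 1, 0, 1, 0, 1, 0]
toy_predictions = [1, 0, 1, 1, 1, 0, 0, 0]

toy_accuracy = accuracy(toy_predictions, toy_labels)
toy_baseline = majority_baseline(toy_labels)
```

The gap between the two numbers, rather than the raw accuracy, indicates how much the classifier has learned beyond the label distribution.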
Load pre-trained models and word map
- load_models_word_map.ipynb
After all the models are created, simply load them and create a word map (with the embeddings) from the wiki and Amit Segal models.
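A word map of this kind can be sketched as a dictionary from each vocabulary word to its embedding vector. The example below assumes the vocabulary list and the embedding rows have already been extracted from a loaded model; `build_word_map`, `cosine`, and `nearest` are hypothetical helper names, and the tiny vectors are toy values, not weights from the Hebrew models.

```python
import math
from typing import Dict, List, Sequence


def build_word_map(vocab: Sequence[str],
                   embeddings: Sequence[Sequence[float]]) -> Dict[str, List[float]]:
    """Pair each vocabulary word with its embedding row."""
    if len(vocab) != len(embeddings):
        raise ValueError("vocab and embeddings must have the same length")
    return {word: list(vec) for word, vec in zip(vocab, embeddings)}


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(word_map: Dict[str, List[float]], word: str, k: int = 3) -> List[str]:
    """Return the k words whose embeddings are most similar to `word`."""
    target = word_map[word]
    scored = [(other, cosine(target, vec))
              for other, vec in word_map.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:k]]


# Toy vocabulary and 2-dimensional vectors, for illustration only.
toy_map = build_word_map(["a", "b", "c"], [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
closest = nearest(toy_map, "a", k=1)
```

With real models, the same lookup lets you compare which neighbours a word has in the wiki embeddings versus the Amit Segal embeddings.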