hebrew_ULMFiT

Universal Language Model Fine-tuning (ULMFiT) for text classification in Hebrew, plus a bonus

I am happy to share the weights of the Hebrew ULMFiT model.
ULMFiT was published by Jeremy Howard and Sebastian Ruder here, and is taught in the Fast.ai course.
The model is very strong because it can easily be transferred to any kind of classification task you want.
The Hebrew Wikipedia corpus was downloaded from Professor Yoav Goldberg's website.
Download the Hebrew models: here

Wiki training
1. hebrew_wiki_part_1.ipynb
  Train a language model on Wikipedia from scratch.
2. hebrew_wiki_part_2.ipynb
  Remove unwanted characters and retrain.
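The cleaning step in part 2 could look roughly like the sketch below. The README does not say which characters count as unwanted, so the allowed set here (Hebrew letters, digits, whitespace and basic punctuation) is an assumption, and `clean_text` is a hypothetical helper name, not a function from the notebook.

```python
import re

# Assumption: "unwanted" means anything outside Hebrew letters (U+05D0-U+05EA),
# digits, whitespace and basic punctuation. The notebook may use a different set.
_UNWANTED = re.compile(r"[^\u05D0-\u05EA0-9\s.,!?:;\"'()\-]")
_SPACES = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Replace disallowed characters with spaces, then collapse whitespace."""
    text = _UNWANTED.sub(" ", text)
    return _SPACES.sub(" ", text).strip()


if __name__ == "__main__":
    # Latin letters are dropped, Hebrew letters and punctuation are kept.
    print(clean_text("שלום world!"))
```

Note that this allowed set also drops niqqud (vowel points); widen the character class if the corpus needs them.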
Bonus - Amit Segal models
1. amit_segal_data.ipynb
  Collect the Amit Segal data.
2. amit_segal_language_model.ipynb
  Train a language model on this corpus.
3. amit_classification.ipynb
  Transfer the model to classify sentences as correct or wrong.
  The model achieves 0.68 accuracy, which is impressive given the small data size and the possibility that a wrong
  sentence looks real (because the language model predicts well), and vice versa.
Load pre-trained models and word map
load_models_word_map.ipynb
After all the models are created, simply load them and create a word map (with the embeddings) from the wiki and Amit models.
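As a rough illustration of what a word map with embeddings can look like, here is a minimal sketch in plain Python. It assumes you have already extracted a vocabulary list and the matching embedding rows from a trained model; the names `itos`, `build_word_map` and `nearest` are hypothetical and are not taken from the notebook.

```python
import math
from typing import Dict, List, Sequence


def build_word_map(itos: Sequence[str],
                   embeddings: Sequence[Sequence[float]]) -> Dict[str, List[float]]:
    """Pair each vocabulary word with its embedding row (same index)."""
    if len(itos) != len(embeddings):
        raise ValueError("vocabulary and embedding matrix differ in length")
    return {word: list(vec) for word, vec in zip(itos, embeddings)}


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest(word_map: Dict[str, List[float]], word: str, k: int = 3) -> List[str]:
    """Return the k words whose embeddings are most similar to `word`."""
    target = word_map[word]
    scored = [(other, cosine(target, vec))
              for other, vec in word_map.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:k]]


if __name__ == "__main__":
    # Toy 2-dimensional vectors, for illustration only.
    toy_map = build_word_map(["a", "b", "c"], [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
    print(nearest(toy_map, "a", k=1))
```

Building one map from the wiki model and one from the Amit model would let you compare which words sit close together in each corpus.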
