/gpt2-ngrams-for-next-word-prediction

Predicting the next word using the probability distribution of n-grams in a training corpus. A fine-tuned GPT-2 implementation was also added recently. The main implementation uses R and Shiny Dashboard. This is intended for next-word prediction on mobile devices, in emails, etc. Training a neural network in Python was also attempted.


Predicting the next word using the probability distribution of n-grams in a training corpus.

  • Implemented a Markov-chain-like model for 2- and 3-word n-grams (5-word n-grams did not fit in the free Shiny tier)
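The core idea behind the Markov-chain-like model can be sketched as follows: count how often each word follows an (n-1)-word prefix, then rank candidates by their conditional probability. This is a minimal Python sketch, not the repo's actual R code; the function names and toy corpus are illustrative.

```python
from collections import Counter, defaultdict

def build_ngram_model(tokens, n):
    """Map each (n-1)-word prefix to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        prefix = tuple(tokens[i:i + n - 1])
        model[prefix][tokens[i + n - 1]] += 1
    return model

def predict_next(model, prefix, k=3):
    """Return up to k (word, probability) pairs for the given prefix."""
    counts = model.get(tuple(prefix), Counter())
    total = sum(counts.values())
    if total == 0:
        return []  # unseen prefix; a real app would back off to shorter n-grams
    return [(w, c / total) for w, c in counts.most_common(k)]

tokens = "the cat sat on the mat and the cat slept".split()
trigrams = build_ngram_model(tokens, 3)
print(predict_next(trigrams, ["the", "cat"]))
```

A production version would back off from trigrams to bigrams (and eventually unigrams, per the TODO below) when a prefix is unseen.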

Link to just the Shiny Dashboard implementation: Shiny Dashboard Implementation

TODO

  • Calculate probabilities for single words as well (1-grams), perhaps only for the top 1k words
  • Remove low-frequency n-grams
  • Implement statistical testing (chi-squared)
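One of the TODO items, removing low-frequency n-grams, amounts to a simple count threshold: rare n-grams inflate the model's memory footprint without improving predictions. A minimal sketch, with a hypothetical threshold and toy counts:

```python
from collections import Counter

def prune_ngrams(ngram_counts, min_count=2):
    """Drop n-grams seen fewer than `min_count` times.

    `min_count` is an assumed cutoff; tune it against held-out accuracy
    and the memory budget (e.g. the free Shiny tier).
    """
    return Counter({ng: c for ng, c in ngram_counts.items() if c >= min_count})

counts = Counter({("of", "the"): 10, ("the", "cat"): 3, ("cat", "yodels"): 1})
pruned = prune_ngrams(counts, min_count=2)
# the singleton ("cat", "yodels") is removed; the frequent n-grams survive
```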

GPT-2 implementation in another branch

  • A Flask endpoint for faster prediction was built separately. The code was changed to load the model object only on the first request.
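The load-once pattern described above can be sketched as a lazily initialized module-level cache: the heavy model import and load are deferred until the first request, and every later request reuses the cached object. This is an illustrative sketch, not the repo's actual endpoint; the route name, model checkpoint (`gpt2`), and `transformers` pipeline are assumptions.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
_model = None  # cached after the first request

def load_model():
    """Heavyweight load, deferred until first use.
    The checkpoint name is a placeholder, not the repo's fine-tuned model."""
    from transformers import pipeline  # expensive import kept out of startup
    return pipeline("text-generation", model="gpt2")

def get_model():
    """Return the cached model, loading it only on the first call."""
    global _model
    if _model is None:
        _model = load_model()
    return _model

@app.route("/predict", methods=["POST"])
def predict():
    text = request.get_json()["text"]
    out = get_model()(text, max_new_tokens=3)
    # return only the newly generated continuation
    return jsonify({"next_words": out[0]["generated_text"][len(text):]})
```

Loading on first use (rather than at import time) keeps the Flask worker's startup fast, at the cost of one slow initial request; preloading in a startup hook is the usual alternative when cold-start latency matters.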