textPrediction

Purpose

This project relates to the Data Science capstone on Coursera. The goal is to take the provided data sources and predict the next word from user input.

Workflow

  1. Acquire the data files
  2. Build a corpus
  3. Clean the corpus
  4. Build a term-document matrix
  5. Collapse the term-document matrix into simple n-gram frequency counts
  6. Predict, for a given input, the next word
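
Steps 2 through 6 can be sketched roughly as follows. This is only an illustrative sketch in Python using the standard library, not the project's actual code. It makes several assumptions: the function names (`clean`, `ngram_counts`, `build_model`, `predict`) are hypothetical, the n-grams are counted directly rather than through an explicit term-document matrix, and the fallback from longer to shorter prefixes is one possible prediction strategy that the workflow above does not specify.

```python
import re
from collections import Counter, defaultdict


def clean(text):
    """Step 3 (assumed rules): lowercase, keep letters and apostrophes, split into words."""
    text = re.sub(r"[^a-z' ]+", " ", text.lower())
    return text.split()


def ngram_counts(docs, n):
    """Steps 4 and 5 combined: total frequency of every n-gram across all documents."""
    counts = Counter()
    for doc in docs:
        tokens = clean(doc)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def build_model(docs, max_n=3):
    """Map each prefix of 1..max_n-1 words to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for n in range(2, max_n + 1):
        for gram, freq in ngram_counts(docs, n).items():
            model[gram[:-1]][gram[-1]] += freq
    return model


def predict(model, text, max_n=3):
    """Step 6: most frequent follower of the longest known prefix, or None."""
    tokens = clean(text)
    for k in range(max_n - 1, 0, -1):
        if len(tokens) < k:
            continue
        prefix = tuple(tokens[-k:])
        if prefix in model:
            return model[prefix].most_common(1)[0][0]
    return None


# Step 2 (toy stand-in for the corpus): three short documents.
docs = [
    "I love you so much",
    "I love you more",
    "I love coffee",
]
model = build_model(docs)
print(predict(model, "they love"))  # prints "you"
```

In the toy corpus, "they love" is never seen as a two-word prefix, so the sketch falls back to the single word "love", which is followed by "you" twice and "coffee" once.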