This project is part of the Data Science capstone on Coursera. The goal is to take the provided data sources and predict the next word from a user's text input. The work breaks down into these steps:
- Acquire the data files
- Build a corpus
- Clean the corpus
- Build a Term Document Matrix
- Collapse the Term Document Matrix into simple n-gram frequency counts
- Predict, for a given input, the next word
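
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual implementation: the tiny `corpus` list stands in for the provided data files, the helper names (`clean`, `ngram_counts`, `build_model`, `predict_next`) are hypothetical, and the n-gram counts are computed directly rather than by first building a full term-document matrix. Prediction uses a simple back-off from the longest matching prefix to shorter ones, which is one common choice and is assumed here.

```python
import re
from collections import Counter, defaultdict


def clean(text):
    """Lowercase the text, replace anything other than letters,
    apostrophes and spaces with a space, and split into tokens."""
    text = text.lower()
    text = re.sub(r"[^a-z' ]+", " ", text)
    return text.split()


def ngram_counts(documents, n):
    """Count every n-gram across all documents.

    This plays the role of the collapsed term-document matrix:
    one total frequency per n-gram, summed over documents.
    """
    counts = Counter()
    for doc in documents:
        tokens = clean(doc)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def build_model(documents, max_n=3):
    """Map each (n-1)-word prefix to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for n in range(2, max_n + 1):
        for gram, freq in ngram_counts(documents, n).items():
            model[gram[:-1]][gram[-1]] += freq
    return model


def predict_next(model, text, max_n=3):
    """Return the most frequent follower of the longest known prefix,
    backing off to shorter prefixes; None if nothing matches."""
    tokens = clean(text)
    for k in range(max_n - 1, 0, -1):
        if len(tokens) < k:
            continue
        prefix = tuple(tokens[-k:])
        if prefix in model:
            return model[prefix].most_common(1)[0][0]
    return None


# Hypothetical stand-in for the acquired data files.
corpus = [
    "I love New York",
    "I love New Jersey",
    "New York is big",
    "I love New York pizza",
]
model = build_model(corpus)
```

With this toy corpus, `predict_next(model, "I love")` looks up the two-word prefix `("i", "love")` first, while `predict_next(model, "She loves new")` finds no matching two-word prefix and falls back to the single word `("new",)`. A real corpus would need more careful cleaning (numbers, profanity, sentence boundaries) and some pruning of rare n-grams to keep the model small.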