dsci-benchmark

R scripts for benchmarking next-word prediction algorithms developed for the Coursera Data Science Capstone Project.

Next word prediction benchmark

A simple R script for benchmarking a next word prediction algorithm.

Usage:

  1. Download the repository.
  2. Extract data.zip into the current folder (the password is provided in the Coursera forum).
  3. Open benchmark.R and run the code up to section 03.
  4. (Optional) Create a wrapper function for your prediction function (section 03).
  5. Run the benchmark (section 04).
  6. Report your results in the Coursera forum.
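
The optional wrapper from step 4 might look roughly like the sketch below. It assumes the benchmark calls a function with a single sentence fragment and expects a character vector of up to three predicted words, best guess first; check section 03 of benchmark.R for the interface it actually requires. The names my_model_predict and predict_wrapper are hypothetical, and my_model_predict is only a stand-in for your own model.

```r
# Hypothetical stand-in for your own model: returns candidate next words
# with a score for each. Replace this with your real prediction function.
my_model_predict <- function(phrase) {
  data.frame(
    word  = c("the", "of", "a", "to"),
    score = c(0.40, 0.25, 0.20, 0.15),
    stringsAsFactors = FALSE
  )
}

# Wrapper (assumed interface): takes one raw sentence fragment and returns
# a character vector of at most `n` predicted words, best guess first.
predict_wrapper <- function(x, n = 3) {
  phrase <- tolower(trimws(x))               # normalise the input text
  candidates <- my_model_predict(phrase)     # ask the model for candidates
  ranked <- candidates[order(candidates$score, decreasing = TRUE), ]
  head(ranked$word, n)                       # keep only the top n words
}

predict_wrapper("  I would like ")
```

Keeping the cleaning and ranking inside the wrapper means your own prediction function does not need to change to fit the benchmark.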

File description:

  • data.zip: Archive containing the benchmark datasets.
  • benchmark.R: Script needed to perform the benchmark (see above).
  • generate_dataset.R: Script used to generate the benchmark datasets (provided for reference only; it should not be re-run).