
Replication of Fei and Yeung (2015), "Temporal Models for Predicting Student Dropout in Massive Open Online Courses".

Primary language: Python. License: MIT.

fy2015-replication

This repository contains the complete configuration files needed to replicate Fei and Yeung (2015), "Temporal Models for Predicting Student Dropout in Massive Open Online Courses", using the MOOC Replication Framework (MORF). The full results of this replication are described in Gardner, Yang, Baker, and Brooks (2018), "Enabling End-To-End Machine Learning Replicability: A Case Study in Educational Data Mining."

Guide to the contents of this repo:

docker: contains the Dockerfile and the scripts needed to build the Docker image. The image can also be pulled directly from Docker Cloud by running the following in a terminal (Docker must be installed):

docker pull themorf/morf-public:fy2015-replication

config: contains two subdirectories, holdout and cv, with configuration files to reproduce the experiment using the holdout and cross-validation architectures, respectively. Note that weeks are zero-indexed: week_0 uses one week of features, and week_4 uses weeks one through five, following the method described in the original Fei and Yeung paper.
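The zero-indexing convention above can be made concrete with a small sketch. The helper names are hypothetical, and the `config/<experiment>/week_<N>` layout is inferred from the single example URL in this README rather than confirmed for every trial.

```python
def weeks_of_features(week_index: int) -> int:
    """Return how many weeks of features a zero-indexed week_N config uses."""
    if week_index < 0:
        raise ValueError("week_index must be non-negative")
    # week_0 uses one week of features, week_4 uses five.
    return week_index + 1


def config_dir(experiment: str, week_index: int) -> str:
    """Build a config subdirectory name such as 'config/holdout/week_4'.

    Assumption: every trial follows the layout seen in the example URL.
    """
    if experiment not in ("holdout", "cv"):
        raise ValueError("experiment must be 'holdout' or 'cv'")
    if week_index < 0:
        raise ValueError("week_index must be non-negative")
    return f"config/{experiment}/week_{week_index}"


# Print the mapping for the five week indices used in this repo (0 through 4).
for week_index in range(5):
    print(config_dir("holdout", week_index), "->",
          weeks_of_features(week_index), "week(s) of features")
```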

Executing the experiments described in this repo:

To execute one of the trials described here (where a trial is a specific model evaluated with features up to a specific week number), use the MORF API functions:

from morf.utils.submit import easy_submit
easy_submit(client_config_url="https://raw.githubusercontent.com/educational-technology-collective/fy2015-replication/master/config/holdout/week_4/svm/controller.py", email_to="your-email@example.com")

Note that the complete extraction-training-testing pipeline may take several hours. If a job uses fork_features(), the job it forks from must be executed first.
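When running several trials, it may help to build the controller URLs programmatically and submit them in an explicit order, so that any job a later one forks from runs first. The following is a minimal sketch: `controller_url` and `submit_trials` are hypothetical helpers, the URL pattern is generalized from the one example above, and the lowercase model directory names other than `svm` are assumptions. The submit function is passed in as an argument so the sketch does not depend on MORF being installed.

```python
from typing import Callable, Iterable, List

# Base path taken from the example URL in this README.
BASE_URL = (
    "https://raw.githubusercontent.com/educational-technology-collective/"
    "fy2015-replication/master/config"
)


def controller_url(experiment: str, week_index: int, model: str) -> str:
    """Build the controller.py URL for one trial.

    Assumption: all trials follow the pattern of the single documented URL.
    """
    return f"{BASE_URL}/{experiment}/week_{week_index}/{model}/controller.py"


def submit_trials(urls: Iterable[str], email_to: str,
                  submit: Callable[..., object]) -> List[str]:
    """Submit each URL in the order given and return the submitted URLs.

    The caller is responsible for listing a parent job before any job that
    forks its features from it.
    """
    submitted = []
    for url in urls:
        submit(client_config_url=url, email_to=email_to)
        submitted.append(url)
    return submitted


# Example usage (requires MORF to be installed):
#   from morf.utils.submit import easy_submit
#   submit_trials([controller_url("holdout", 4, "svm")],
#                 "your-email@example.com", easy_submit)
```

Which trials fork features from which is defined in the individual controller scripts, so the ordering has to be checked there before building the list.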

Each trial listed below also has a persistent Digital Object Identifier (DOI) that links to its client.config and controller scripts. Together with the Docker image described above, which is common to all of the trials, these fully reproduce every trial of the experiment.

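| Experiment | Week | Model | Zenodo Deposition ID | DOI |
|------------|------|-------|----------------------|-----|
| holdout | 0 | LR | 1275035 | DOI |
| holdout | 0 | RNN | 1275045 | DOI |
| holdout | 0 | SVM | 1275193 | DOI |
| holdout | 0 | LSTM | 1275041 | DOI |
| holdout | 1 | LR | 1275049 | DOI |
| holdout | 1 | LSTM | 1275055 | DOI |
| holdout | 1 | RNN | 1275059 | DOI |
| holdout | 1 | SVM | 1275197 | DOI |
| holdout | 2 | LR | 1275063 | DOI |
| holdout | 2 | LSTM | 1275071 | DOI |
| holdout | 2 | RNN | 1275074 | DOI |
| holdout | 2 | SVM | 1275201 | DOI |
| holdout | 3 | LR | 1275077 | DOI |
| holdout | 3 | RNN | 1275081 | DOI |
| holdout | 3 | LSTM | 1275083 | DOI |
| holdout | 3 | SVM | 1275203 | DOI |
| holdout | 4 | LR | 1275331 | DOI |
| holdout | 4 | RNN | 1275335 | DOI |
| holdout | 4 | LSTM | 1275339 | DOI |
| holdout | 4 | SVM | 1275341 | DOI |
| cv | 0 | LR | 1275087 | DOI |
| cv | 0 | RNN | 1275091 | DOI |
| cv | 0 | LSTM | 1275095 | DOI |
| cv | 0 | SVM | 1275207 | DOI |
| cv | 1 | LR | 1275101 | DOI |
| cv | 1 | RNN | 1275103 | DOI |
| cv | 1 | LSTM | 1275107 | DOI |
| cv | 1 | SVM | 1275211 | DOI |
| cv | 2 | LR | 1275113 | DOI |
| cv | 2 | RNN | 1275119 | DOI |
| cv | 2 | LSTM | 1275121 | DOI |
| cv | 2 | SVM | 1275213 | DOI |
| cv | 3 | LR | 1275129 | DOI |
| cv | 3 | RNN | 1275133 | DOI |
| cv | 3 | LSTM | 1275135 | DOI |
| cv | 3 | SVM | 1275215 | DOI |
| cv | 4 | LR | 1275345 | DOI |
| cv | 4 | RNN | 1275347 | DOI |
| cv | 4 | LSTM | 1275351 | DOI |
| cv | 4 | SVM | 1275355 | DOI |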
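For scripted use, the table above could be kept as a lookup keyed by experiment, week, and model. The sketch below copies only four rows for brevity, and `deposition_id` is a hypothetical helper, not part of MORF or of this repository.

```python
from typing import Dict, Tuple

# A few rows copied from the table above:
# (experiment, week, model) -> Zenodo deposition ID.
DEPOSITIONS: Dict[Tuple[str, int, str], int] = {
    ("holdout", 0, "LR"): 1275035,
    ("holdout", 4, "SVM"): 1275341,
    ("cv", 0, "LR"): 1275087,
    ("cv", 4, "LSTM"): 1275351,
}


def deposition_id(experiment: str, week: int, model: str) -> int:
    """Return the Zenodo deposition ID for one trial.

    Raises KeyError if the trial is not in the (partial) DEPOSITIONS mapping.
    """
    key = (experiment.lower(), week, model.upper())
    try:
        return DEPOSITIONS[key]
    except KeyError:
        raise KeyError(f"no deposition listed for {key}") from None


print(deposition_id("holdout", 4, "svm"))
```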