A predictor for important pull requests. Designed to be invoked by the analyzer.
The predictor uses the historical data of a given repository to predict whether a pull request requires more attention than others. The prediction is made with machine learning: the algorithm is Random Forest, implemented in R.
Please note that the predictor is specifically written for the GHTorrent project.
- Clone the project into `~/predictor`
- Install dependencies and build the project with `sbt compile`
- Copy `src/main/resources/settings.properties.dist` to `src/main/resources/settings.properties`
- Configure the application by editing `src/main/resources/settings.properties`
  - e.g. model directory: `~/tmp/`
  - e.g. Rscript location: `/usr/bin/Rscript`
  - e.g. script directory: `~/predictor/R`
  - Ignore the repository settings
- Package the project into a `.jar` file with `sbt assembly`
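
The copy step in the list above can be scripted so that it never overwrites a settings file that has already been edited. The following is a minimal shell sketch and not part of the project; the function name `copy_settings` and its single argument (the project directory) are assumptions made for illustration.

```shell
# copy_settings PROJECT_DIR   (hypothetical helper, not shipped with the predictor)
# Copies settings.properties.dist to settings.properties inside
# PROJECT_DIR/src/main/resources, unless a settings.properties already exists.
copy_settings() {
    res="$1/src/main/resources"
    if [ ! -f "$res/settings.properties.dist" ]; then
        echo "template not found: $res/settings.properties.dist" >&2
        return 1
    fi
    if [ ! -f "$res/settings.properties" ]; then
        cp "$res/settings.properties.dist" "$res/settings.properties"
    fi
}
```

Calling `copy_settings ~/predictor` after `sbt compile` and before editing the settings leaves an existing `settings.properties` untouched, so the step is safe to repeat.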
The predictor is now set up for use by the analyzer.
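
Before handing the predictor over to the analyzer, it may help to confirm that the configured locations actually exist. The sketch below is one possible check, assuming the three example values from the configuration step are passed in by hand; `check_paths` is a hypothetical helper and does not read `settings.properties` itself.

```shell
# check_paths RSCRIPT SCRIPT_DIR MODEL_DIR   (hypothetical helper)
# Prints one line per problem and returns non-zero if any check fails.
check_paths() {
    rc=0
    if [ ! -x "$1" ]; then
        echo "Rscript not executable: $1"
        rc=1
    fi
    if [ ! -d "$2" ]; then
        echo "script directory missing: $2"
        rc=1
    fi
    if [ ! -d "$3" ]; then
        echo "model directory missing: $3"
        rc=1
    fi
    return $rc
}
```

With the example values above, the call would be `check_paths /usr/bin/Rscript "$HOME/predictor/R" "$HOME/tmp"`; no output and a zero exit status mean all three locations were found.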