This repository contains the code used in the paper "Sentiment Analysis of Pull Requests".
The initial checking and storing of pull requests is done in Preprocess.ipynb.
This is where we generate our dataset of 10k pull requests and their comments.
- Run python3 convert.py
- Run java -jar SentiStrength-SE_V1.5.jar
- Detect sentiments for each data file:
  a. data/input-se-both.txt
  b. data/input-se-issues.txt
  c. data/input-se-review.txt
- Save the outputs to the following files:
  a. data/se-both.txt
  b. data/se-issue
  c. data/se-review
- Run python3 convert_senti-strength-se.py
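
Because the SentiStrength-SE step is manual, it can be easy to lose track of which files already exist. The following is a minimal sketch, not part of the repository, that checks the file names listed above and suggests the next step. The helper names (`missing_files`, `pipeline_status`) are hypothetical, the pairing of each input file with an output file is assumed from the matching a/b/c order, and it is also an assumption that convert.py is what produces the data/input-se-*.txt files.

```python
from pathlib import Path

# File names are copied from the steps above. Pairing each input with an
# output is an assumption based on the matching a/b/c order.
SENTISTRENGTH_FILES = {
    "data/input-se-both.txt": "data/se-both.txt",
    "data/input-se-issues.txt": "data/se-issue",
    "data/input-se-review.txt": "data/se-review",
}


def missing_files(paths, root="."):
    """Return the paths (relative to root) that do not exist yet."""
    base = Path(root)
    return [p for p in paths if not (base / p).is_file()]


def pipeline_status(root="."):
    """Return (suggested next step, files that are still missing)."""
    # Assumption: convert.py is the step that writes the input-se-*.txt files.
    missing_inputs = missing_files(SENTISTRENGTH_FILES.keys(), root)
    if missing_inputs:
        return "run convert.py", missing_inputs
    missing_outputs = missing_files(SENTISTRENGTH_FILES.values(), root)
    if missing_outputs:
        return "run SentiStrength-SE", missing_outputs
    return "run convert_senti-strength-se.py", []


if __name__ == "__main__":
    step, missing = pipeline_status()
    print(f"Next step: {step}")
    for path in missing:
        print(f"  missing: {path}")
```

Run from the repository root, this prints the suggested next step and any files it could not find.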
Our model fitting is done in logistic_regression.py.
keras_nn.py is the Keras model we used as a sanity check.
sentiment-both.csv, sentiment-issue.csv, and sentiment-review.csv are data files for use in analysis.
They contain features for issues and code review comments, just issues, and just code review comments, respectively.
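
The column layout of these files is not described here, so a quick way to get oriented is to list each file's columns and row count. The sketch below uses only the Python standard library. The `summarize_csv` helper is hypothetical, and it assumes each file starts with a header row.

```python
import csv

# File names come from the text above. Their columns are not documented
# here, so this sketch only reports what it finds.
DATA_FILES = ["sentiment-both.csv", "sentiment-issue.csv", "sentiment-review.csv"]


def summarize_csv(path):
    """Return (column names, number of data rows), assuming a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # first row is treated as the header
        row_count = sum(1 for _ in reader)  # remaining rows are data
    return header, row_count


if __name__ == "__main__":
    for name in DATA_FILES:
        try:
            columns, rows = summarize_csv(name)
        except FileNotFoundError:
            print(f"{name}: not found")
            continue
        print(f"{name}: {rows} rows, columns = {columns}")
```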