Sentiment Analysis of Pull Requests

This repository contains the code used in the paper "Sentiment Analysis of Pull Requests".

Structure

Data Prep

The initial checking and storing of pull requests is done in Preprocess.ipynb. This is where we generate our dataset of 10k pull requests and their comments.

Sentiment Analysis

1. Run python3 convert.py
2. Run java -jar SentiStrength-SE_V1.5.jar
3. Detect sentiments for each data file:
   a. data/input-se-both.txt
   b. data/input-se-issues.txt
   c. data/input-se-review.txt
4. Save the outputs to the following files:
   a. data/se-both.txt
   b. data/se-issue
   c. data/se-review
5. Run python3 convert_senti-strength-se.py
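
Because the SentiStrength-SE step is run by hand, it can help to confirm that the expected files are in place before running the final conversion script. The following is a minimal sketch, not part of the repository: the file names are copied from the steps above, while the `missing_files` helper and the idea of treating an empty file as missing are assumptions made for illustration.

```python
from pathlib import Path

# File names copied verbatim from the steps above.
# The checker itself is hypothetical and not part of this repository.
INPUTS = ["input-se-both.txt", "input-se-issues.txt", "input-se-review.txt"]
OUTPUTS = ["se-both.txt", "se-issue", "se-review"]


def missing_files(data_dir, names):
    """Return the entries of `names` that are absent or empty in `data_dir`."""
    data_dir = Path(data_dir)
    missing = []
    for name in names:
        path = data_dir / name
        # Assumption: an empty file means the step did not finish.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for label, names in (("input", INPUTS), ("output", OUTPUTS)):
        for name in missing_files("data", names):
            print(f"missing {label} file: data/{name}")
```

Running the script from the repository root prints one line per missing file and prints nothing when all six files are present.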

Analysis

Our model fitting is done in logistic_regression.py. keras_nn.py is the Keras model we used as a sanity check.

Data

sentiment-both.csv, sentiment-issue.csv, and sentiment-review.csv are data files for use in analysis. They contain features for issues and code review comments, just issues, and just code review comments, respectively.
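
To see which features each file contains, the files can be inspected with the standard library. This is a rough sketch under stated assumptions: it assumes each CSV starts with a header row and is UTF-8 encoded, and it takes the directory as a parameter because the location of the files is not stated here. The `load_rows` and `summarize` helpers are hypothetical names, not functions from this repository.

```python
import csv
from pathlib import Path

# File names are from this README; the labels are chosen for illustration.
DATASETS = {
    "both": "sentiment-both.csv",
    "issue": "sentiment-issue.csv",
    "review": "sentiment-review.csv",
}


def load_rows(path):
    """Read one CSV into (column names, list of row dicts).

    Assumes the first line of the file is a header row.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return list(reader.fieldnames or []), rows


def summarize(directory="."):
    """Map each dataset label to (column names, row count) for files that exist."""
    summary = {}
    for label, name in DATASETS.items():
        path = Path(directory) / name
        if path.is_file():
            columns, rows = load_rows(path)
            summary[label] = (columns, len(rows))
    return summary


if __name__ == "__main__":
    for label, (columns, count) in summarize().items():
        print(f"{label}: {count} rows, columns = {columns}")
```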
