CUP: Just-In-Time Comment UPdater

Directory Structure

  • baselines: the implementation of nnupdater and scripts for running baselines
  • models: neural models
  • scripts: the scripts for conducting experiments

Prepare Requirements

  • Java 8
  • Install python dependencies through conda
  • Install nlg-eval and set it up
conda env create -f environment.yml

pip install git+https://github.com/Maluuba/nlg-eval.git@master
# set the data_path
nlg-eval --setup ${data_path}

Download Dataset

  • Our dataset, trained model and archived results can be downloaded from here
  • Another archive of this project can be found at https://tinyurl.com/jitcomment
  • By default, we store the dataset in ../dataset
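Since the scripts read from ../dataset by default, it is worth confirming the archive was unpacked into the right place before training. A small stdlib check can do this; the file names passed in below are placeholders, not the archive's actual contents:

```python
from pathlib import Path

def missing_entries(dataset_dir, expected):
    """Return the expected entries that are absent under dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in expected if not (root / name).exists()]

# Placeholder file names -- substitute the names found in the downloaded archive.
print(missing_entries("../dataset", ["train.jsonl", "valid.jsonl", "test.jsonl"]))
```

An empty list means every expected file is in place.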

Train

cd scripts
# 0 for GPU 0
./train_model.sh 0 CoAttnBPBAUpdater models.updater.CoAttnBPBAUpdater ../dataset
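train_model.sh receives the model as a dotted class path (models.updater.CoAttnBPBAUpdater), which the training code presumably resolves at runtime. A minimal sketch of that common pattern, demonstrated here with a standard-library class since the CUP classes only exist inside the repository:

```python
import importlib

def resolve_class(dotted_path):
    """Split 'pkg.module.ClassName' and import the class it names."""
    module_path, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# Inside the repository the argument would be "models.updater.CoAttnBPBAUpdater".
decoder_cls = resolve_class("json.JSONDecoder")
print(decoder_cls.__name__)  # JSONDecoder
```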

Infer and Evaluate

cd scripts
./infer_eval.sh 0 CoAttnBPBAUpdater models.updater.CoAttnBPBAUpdater ../dataset
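infer_eval.sh scores the generated comments against the reference comments (via nlg-eval, among others). Purely as an illustration of this kind of scoring, not CUP's actual metric suite, a sentence-level exact-match accuracy can be computed like this:

```python
def exact_match_accuracy(hypotheses, references):
    """Fraction of hypotheses whose tokens exactly match their reference."""
    assert len(hypotheses) == len(references)
    hits = sum(h.split() == r.split() for h, r in zip(hypotheses, references))
    return hits / len(references)

score = exact_match_accuracy(
    ["update the cache", "return the size"],
    ["update the cache", "returns the size"],
)
print(score)  # 0.5
```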

Build Vocab Yourself

You can also build the vocabularies yourself instead of using the ones provided with our dataset.

# download fastText pre-trained model
cd ../dataset
wget https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.en.300.bin.gz
gunzip cc.en.300.bin.gz

cd scripts
./build_vocab.sh ../../dataset/cc.en.300.bin
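build_vocab.sh derives the token vocabularies that the fastText vectors then initialize. The core of such a step is usually a frequency cut-off over the training tokens; the sketch below is an assumption about that pattern, and the special-token names and threshold are not taken from the repository:

```python
from collections import Counter

def build_vocab(token_streams, min_freq=2, specials=("<pad>", "<unk>")):
    """Map each frequent-enough token to an integer id, after the special tokens."""
    counts = Counter(tok for stream in token_streams for tok in stream)
    vocab = {tok: i for i, tok in enumerate(specials)}
    for tok, freq in counts.most_common():
        if freq >= min_freq and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

vocab = build_vocab([["update", "the", "comment"], ["update", "the", "code"]])
print(vocab)  # "update" and "the" appear twice and survive min_freq=2
```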

Run Baselines

  • Clone FracoUpdater:
# clone FracoUpdater
git clone https://github.com/Tbabm/FracoUpdater
  • Install FracoUpdater's dependencies according to its README
  • Run
python -m baselines.run_baselines run_all_baselines

The results will be placed in the dataset directory and can be evaluated with CUP/eval.py.
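The baselines module is invoked with a subcommand name (run_all_baselines). A common way to wire that up is a name-to-function dispatch table; the sketch below is a hypothetical stand-in, since the real module may use a CLI library instead:

```python
def run_all_baselines():
    # Placeholder body; the real module runs each baseline in turn.
    return "ran all baselines"

# Hypothetical command table mapping subcommand names to functions.
COMMANDS = {"run_all_baselines": run_all_baselines}

def dispatch(argv):
    """Look the first argument up in the command table and call it."""
    if not argv or argv[0] not in COMMANDS:
        raise SystemExit(f"usage: run_baselines [{'|'.join(COMMANDS)}]")
    return COMMANDS[argv[0]]()

print(dispatch(["run_all_baselines"]))  # ran all baselines
```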

Run CUP's Variants

cd scripts
# 0 for GPU 0
./run_variants.sh 0

Get Readable Result

python -m tools dump_all_readables

The readable files can be found in the results directory.
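dump_all_readables converts the stored result records into plain-text files for inspection. A minimal sketch of that kind of dump, in which the JSON field names are assumptions rather than the project's actual schema:

```python
import json
import tempfile
from pathlib import Path

def dump_readable(results_jsonl, out_path):
    """Write one 'old / new / pred' text block per JSON line of results."""
    blocks = []
    for raw in Path(results_jsonl).read_text().splitlines():
        rec = json.loads(raw)  # field names here are assumptions
        blocks.append(f"old:  {rec['old_comment']}\n"
                      f"new:  {rec['new_comment']}\n"
                      f"pred: {rec['prediction']}\n")
    Path(out_path).write_text("\n".join(blocks))

# Tiny self-contained demo with a made-up result record.
with tempfile.TemporaryDirectory() as d:
    src, dst = Path(d, "results.jsonl"), Path(d, "readable.txt")
    src.write_text(json.dumps({"old_comment": "returns size",
                               "new_comment": "returns length",
                               "prediction": "returns length"}))
    dump_readable(src, dst)
    print(dst.read_text())
```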