Primary Language: Python


Dataset Downloads

Extract the two datasets into data_codeflaws and data_nbl, respectively.


The following dependencies are required to train or run the model:

  • pytorch (1.6.0+)
  • fasttext (0.9.2)
  • dgl (1.6.0)
  • networkx (2.5.1)
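
As a quick sanity check before training, the installed versions can be compared against the list above. This is only a sketch: it assumes the packages were installed under the PyPI distribution names torch, fasttext, dgl and networkx, and the helper names installed_versions and report are hypothetical, not part of this repository.

```python
from importlib import metadata

# Versions copied from the dependency list above.
# Assumption: pytorch is installed under the distribution name "torch".
REQUIRED = {
    "torch": "1.6.0",
    "fasttext": "0.9.2",
    "dgl": "1.6.0",
    "networkx": "2.5.1",
}


def installed_versions(names):
    """Return {name: installed version, or None if the package is missing}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(required=REQUIRED):
    """Build one human-readable line per required package."""
    lines = []
    for name, version in installed_versions(required).items():
        status = version if version is not None else "not installed"
        lines.append(f"{name}: found {status}, README lists {required[name]}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

The script only reports what it finds; it does not decide whether a newer version is compatible.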

Optional (for visualization):

  • pygraphviz and graphviz

Java: version 1.11+

Extract jars.zip into the jars folder.
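
If a command-line unzip tool is not at hand, the extraction step can be done with the Python standard library. The following is a minimal sketch; extract_archive is a hypothetical helper, not something shipped with this repository, and it assumes jars.zip sits in the current working directory.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir (created if missing).

    Returns the sorted list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


# Example call (not executed here):
#   extract_archive("jars.zip", "jars")
```

If the archive already contains a top-level jars directory, extracting into "." instead avoids a nested jars/jars folder. The same helper can be pointed at the dataset archives and the data_codeflaws and data_nbl folders.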


# Codeflaws node-level
python3 -m codeflaws.train_nx_a_nc_cfl
# NBL node-level
python3 -m nbl.train_nx_a_nc

# Codeflaws statement-level
python3 -m codeflaws.train_nx_astdiff_nocontent_gumtree
# Prutor statement-level
python3 -m nbl.train_nx_astdiff_nocontent_gumtree


The training scripts already include evaluation: if the train() call is disabled, a script proceeds directly to evaluation. To evaluate a pretrained model, copy it into the train_dirs directory configured in utils/utils.py.
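
The train-then-evaluate flow can be pictured with the sketch below. It is illustrative only: run, its do_train flag and the stand-in callables are hypothetical and do not mirror the actual code of the training scripts, where train() has to be disabled by editing the script itself.

```python
def run(train_fn, evaluate_fn, do_train=True):
    """Optionally train, then always evaluate and return the evaluation result."""
    if do_train:
        train_fn()
    return evaluate_fn()


if __name__ == "__main__":
    log = []
    # Stand-ins for the real train/evaluate procedures (hypothetical).
    run(lambda: log.append("train"),
        lambda: log.append("evaluate"),
        do_train=False)
    print(log)  # prints ['evaluate']
```

With do_train=False only the evaluation step runs, which is the behaviour expected once a pretrained model is in place.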

Pretrained model:


Please note that, although our original settings do not require it, a CodeBERT pretrained file can be placed in the preprocess folder so that the content of each AST node is used instead of only its node type.