SATD refers to self-admitted technical debt, which is introduced intentionally (e.g., through temporary fix) and admitted by developers themselves and always recorded in source code comments. SATD Detector[1] is a tool that is able to automatically detect SATD comments text mining. This is the back-end of SATD Detector, which provides command-line interface and Java API of SATD Detector. More details can be found in [1].
Download binaries: satd_detector.jar
Use build-in models to test whether a comment is SATD comment or not:
$ java -jar satd_detector.jar test
>This is an ugly implementation.
SATD
>This function read lines from file.
Not SATD
>/exit
bye!
To train your own models, you need provide three data files:
- Comment file: contains all the comments in your dataset. One line for each comment.
- Label file: contains the labels of comments.
- Project file: contains the project names of comments.
Comment file example: comments.txt
// This is comment 1
// TODO: fix this function later
/* This is comment 3 */
Label file example: labels.txt
No
Yes
No
Project file example: projects.txt
project1
project2
project3
Train your own model
$ java -jar satd_detector.jar train -comments comments.txt -labels labels.txt -projects projects.txt -out_dir ./models/
Finish Saving classifier of project1
Finish Saving classifier of project2
Finish Saving classifier of project3
Now you can see 9 files in ./models/
:
- models
- project1.as
- project1.model
- project1.stw
- project2.as
- project2.model
- project2.stw
- project3.as
- project3.model
- project3.stw
All these files will be loaded while classifying.
$ java -jar satd_detector.jar test -model_dir ./models/
>This is an ugly implementation.
SATD
>This function read lines from file.
Not SATD
>/exit
bye!
You can also use SATD Detector to test all comments saved in a text file (separated by newline
).
SATD Detector will read, analyse and write the results in a new text file containing only the results, as "SATD" or "not", in the exact same path previously provided + ".result".
$ java -jar satd_detector.jar test -comment_file /path/to/comment-file.txt
bye!
Results text file: /path/to/comment-file.txt.result
You can also use SATD Detector in your Java projects.
- Download satd_detector.jar
- Add it to the classpath
import satd_detector.core.train.Train;
String commentFile = "./comments.txt";
String labelsFile = "./labels.txt";
String projectsFile = "./projects.txt";
String outDir = "./models/";
// train your own models
try{
Train.buildModels(commentFile, labelFile, projectFile, outDir);
} catch (Exception e) {
e.printStackTrace();
}
import satd_detector.core.utils.SATDDetector;
// create an instance using build-in models
SATDDetector detector1 = new SATDDetector();
String comment = "This is an ugly implementation."
boolean result = detector1.isSATD(comment)
// create an instance using your own models
String modelDir = "./models/"
SATDDetector detector2 = new SATDDetector(modelDir);
boolean result = detector2.isSATD(comment)
$ java -jar satd_detector.jar test -h
usage: test
-h Show help message
-model_dir <arg> Dir which stores all the models. Using build-in models
if not specified.
$ java -jar satd_detector.jar train -h
usage: train
-comments <arg> The file which stores all comments.
-h Show help messages.
-labels <arg> The file which stores labels of all comments.
-out_dir <arg> Dir to store trained models.
-projects <arg> The file which stores projects of all comments.
- weka 3.8.1 or above
- snowball-stemmers
- which can be installed using weka package manager
- Apache commons cli 1.4 or above
- Git clone or download zip
- Add
resources
folder to your classpath - Add
weka.jar
,snowball-stemmers.jar
andcommons-cli.jar
to your classpath
For Eclipse, you can modify classpath quickly by:
- Righ-click on the project
- Choose
Build Path
->Configure Build Path
- for
resources
folder, chooseSource
tab -> Add Folder - for jar files, choose
Libraries
tab -> Add External JARs
- for
[1] SATD Detector: A Text-Mining-Based Self-Admitted Technical Debt Detection Tool.