RefactoringMiner

RefactoringMiner is a library/API written in Java that can detect refactorings applied in the history of a Java project.

Currently, it supports the detection of the following refactorings:

  1. Extract Method
  2. Inline Method
  3. Rename Method
  4. Move Method/Attribute
  5. Pull Up Method/Attribute
  6. Push Down Method/Attribute
  7. Extract Superclass/Interface
  8. Move Class
  9. Rename Class
  10. Move and Rename Class
  11. Extract and Move Method
  12. Move Source Folder
  13. Change Package (Move, Rename, Split, Merge)
  14. Extract Variable
  15. Inline Variable
  16. Parameterize Variable
  17. Rename Variable/Parameter
  18. Rename Attribute
  19. Move and Rename Attribute
  20. Replace Variable with Attribute
  21. Replace Attribute (with Attribute)

To build the project, run ./gradlew jar (or gradlew jar on Windows) in the project's root directory. Alternatively, you can generate a complete distribution zip that includes all runtime dependencies by running ./gradlew distZip.

You can also work on the project in the Eclipse IDE. First, run ./gradlew eclipse to generate the Eclipse project metadata files. Then import the project into Eclipse using the Import Existing Project feature.

Research

If you are using RefactoringMiner in your research, please cite the following paper:

Nikolaos Tsantalis, Matin Mansouri, Laleh Eshkevari, Davood Mazinanian, and Danny Dig, "Accurate and Efficient Refactoring Detection in Commit History," 40th International Conference on Software Engineering (ICSE 2018), Gothenburg, Sweden, May 27 - June 3, 2018.

RefactoringMiner has been used in the following studies:

Contributors

The code in package gr.uom.java.xmi.* has been developed by Nikolaos Tsantalis.

The code in package org.refactoringminer.* has been developed by Danilo Ferreira e Silva.

API usage guidelines

RefactoringMiner can automatically detect refactorings in the entire history of git repositories, between specified commits or tags, or at specified commits.

In the code snippet below we demonstrate how to print all refactorings performed in the toy project https://github.com/danilofes/refactoring-toy-example.git.

GitService gitService = new GitServiceImpl();
GitHistoryRefactoringMiner miner = new GitHistoryRefactoringMinerImpl();

Repository repo = gitService.cloneIfNotExists(
    "tmp/refactoring-toy-example",
    "https://github.com/danilofes/refactoring-toy-example.git");

miner.detectAll(repo, "master", new RefactoringHandler() {
  @Override
  public void handle(RevCommit commitData, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitData.getId().getName());
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
});

You can also analyze a range of commits using detectBetweenCommits, or a range of tags using detectBetweenTags. RefactoringMiner starts from the specified start commit or tag and iterates backwards through the history. If the end commit or end tag is not specified, it continues until it reaches the beginning of the history.

miner.detectBetweenCommits(repo, 
    "d4bce13a443cf12da40a77c16c1e591f4f985b47", "dde3ef036bdddae550c4e98373db4c81d77b5043",
    new RefactoringHandler() {
  @Override
  public void handle(RevCommit commitData, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitData.getId().getName());
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
});
miner.detectBetweenTags(repo, "1.1", "1.0", new RefactoringHandler() {
  @Override
  public void handle(RevCommit commitData, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitData.getId().getName());
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
});

It is possible to analyze a specific commit using detectAtCommit instead of detectAll. The commit is identified by its SHA-1 hash, as in the example below:

miner.detectAtCommit(repo, "https://github.com/danilofes/refactoring-toy-example.git",
    "05c1e773878bbacae64112f70964f4f2f7944398", new RefactoringHandler() {
  @Override
  public void handle(RevCommit commitData, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitData.getId().getName());
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
});

You can get the churn of a specific commit using churnAtCommit as follows:

Churn churn = miner.churnAtCommit(repo, "05c1e773878bbacae64112f70964f4f2f7944398", handler);

There is also a lower-level API that works by comparing the source code in two folders, which contain the code before and after the changes:

// Assuming you have a List<String> of the changed/added/removed file paths from version1 to version2
// filename format example: /src/gr/uom/java/xmi/UMLModelASTReader.java

File rootFolder1 = new File("/path/to/version1/");
File rootFolder2 = new File("/path/to/version2/");
List<String> filePaths1 = new ArrayList<String>();
filePaths1.add("/src/package/Foo.java");
List<String> filePaths2 = new ArrayList<String>();
filePaths2.add("/src/package/Foo.java");

UMLModel model1 = new UMLModelASTReader(rootFolder1, filePaths1).getUmlModel();
UMLModel model2 = new UMLModelASTReader(rootFolder2, filePaths2).getUmlModel();
UMLModelDiff modelDiff = model1.diff(model2);
List<Refactoring> refactorings = modelDiff.getRefactorings();

Note that by doing this you may get different results from the git history analysis, because the latter uses information from git to better identify moves and renames.
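The file path lists passed to UMLModelASTReader can be written by hand, but for a quick experiment it may be easier to collect every .java file under each root folder. The following is a minimal sketch that uses only the JDK. JavaFileLister and its methods are hypothetical helpers, not part of RefactoringMiner, and the sketch assumes that paths should be relative to the root folder, start with a slash, and use forward slashes, as in the filename format example above.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of RefactoringMiner.
final class JavaFileLister {

    // Converts a file located under root into a path such as "/src/package/Foo.java".
    static String toRelativePath(Path root, Path file) {
        String relative = root.relativize(file).toString();
        // Normalize platform separators so the result looks the same everywhere.
        return "/" + relative.replace(File.separatorChar, '/');
    }

    // Returns the sorted relative paths of all regular .java files below rootFolder.
    static List<String> listJavaFiles(File rootFolder) {
        Path root = rootFolder.toPath();
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".java"))
                .map(p -> toRelativePath(root, p))
                .sorted()
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this helper, filePaths1 could be obtained as JavaFileLister.listJavaFiles(rootFolder1), and filePaths2 likewise from rootFolder2. Listing every file is a simplification; restricting the lists to the files that actually changed between the two versions is what the example above assumes.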

Location information for the detected refactorings

All classes implementing the Refactoring interface include refactoring-specific location information. For example, ExtractOperationRefactoring offers the following methods:

  1. getSourceOperationCodeRangeBeforeExtraction() : Returns the code range of the source method in the parent commit
  2. getSourceOperationCodeRangeAfterExtraction() : Returns the code range of the source method in the child commit
  3. getExtractedOperationCodeRange() : Returns the code range of the extracted method in the child commit
  4. getExtractedCodeRangeFromSourceOperation() : Returns the code range of the extracted code fragment from the source method in the parent commit
  5. getExtractedCodeRangeToExtractedOperation() : Returns the code range of the extracted code fragment to the extracted method in the child commit
  6. getExtractedOperationInvocationCodeRange() : Returns the code range of the invocation of the extracted method inside the source method in the child commit

Each method returns a CodeRange object including the following properties:

String filePath
int startLine
int endLine
int startColumn
int endColumn
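To illustrate how these five properties might be consumed, here is a small sketch that turns a range into a printable location string. CodeRangeSketch is a hypothetical stand-in that only mirrors the properties listed above; it is not the library's CodeRange class, and the sketch assumes that startLine and endLine are both inclusive.

```java
// Hypothetical stand-in mirroring the five listed properties;
// this is not the library's CodeRange class.
final class CodeRangeSketch {
    final String filePath;
    final int startLine;
    final int endLine;
    final int startColumn;
    final int endColumn;

    CodeRangeSketch(String filePath, int startLine, int endLine,
                    int startColumn, int endColumn) {
        this.filePath = filePath;
        this.startLine = startLine;
        this.endLine = endLine;
        this.startColumn = startColumn;
        this.endColumn = endColumn;
    }

    // Formats the range as "filePath:startLine:startColumn-endLine:endColumn".
    String describe() {
        return filePath + ":" + startLine + ":" + startColumn
            + "-" + endLine + ":" + endColumn;
    }

    // Number of lines covered, assuming both boundary lines are inclusive.
    int lineCount() {
        return endLine - startLine + 1;
    }
}
```

A location string of this shape can be convenient when reporting the before and after positions of a refactoring side by side.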

Running from the command line

When you build a distributable application with ./gradlew distZip, you can run RefactoringMiner as a command line application. Extract the file build/distributions/RefactoringMiner.zip to the desired location, and cd into the bin folder (or add it to your path). Then, run RefactoringMiner -h to show its usage:

> RefactoringMiner -h

-h                                                            Show tips
-a <git-repo-folder> <branch>                                 Detect all refactorings at <branch> for <git-repo-folder>. If <branch> is not specified, commits from all branches are analyzed.
-bc <git-repo-folder> <start-commit-sha1> <end-commit-sha1>   Detect refactorings Between <start-commit-sha1> and <end-commit-sha1> for project <git-repo-folder>
-bt <git-repo-folder> <start-tag> <end-tag>                   Detect refactorings Between <start-tag> and <end-tag> for project <git-repo-folder>
-c <git-repo-folder> <commit-sha1>                            Detect refactorings at specified commit <commit-sha1> for project <git-repo-folder>

For example, suppose that you run:

> git clone https://github.com/danilofes/refactoring-toy-example.git refactoring-toy-example
> RefactoringMiner -c refactoring-toy-example 36287f7c3b09eff78395267a3ac0d7da067863fd

The output would be:

4 refactorings found in commit 36287f7c3b09eff78395267a3ac0d7da067863fd:
  Pull Up Attribute     private age : int from class org.animals.Labrador to class org.animals.Dog
  Pull Up Attribute     private age : int from class org.animals.Poodle to class org.animals.Dog
  Pull Up Method        public getAge() : int from class org.animals.Labrador to public getAge() : int from class org.animals.Dog
  Pull Up Method        public getAge() : int from class org.animals.Poodle to public getAge() : int from class org.animals.Dog

When you run RefactoringMiner with -a, -bc, or -bt, a result CSV file that uses a semicolon (;) as the delimiter is generated in the repository directory after all commits have been analyzed.
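Because the result file is semicolon-delimited, it can be post-processed with a few lines of plain Java. The sketch below counts rows per refactoring type. ResultCsvSketch is a hypothetical helper, and the column layout (commit id, refactoring type, description, in that order) is an assumption that should be checked against the generated file before relying on it.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reader for the semicolon-delimited result file.
// Assumed column order: commit id; refactoring type; description.
final class ResultCsvSketch {

    // Splits one row into at most three fields, so that any further
    // semicolons remain inside the last field.
    static String[] parseRow(String line) {
        return line.split(";", 3);
    }

    // Counts rows per value of the second column (assumed to be the type).
    static Map<String, Integer> countByType(List<String> rows) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String row : rows) {
            String[] fields = parseRow(row);
            if (fields.length < 2) {
                continue; // skip blank or malformed rows
            }
            counts.merge(fields[1], 1, Integer::sum);
        }
        return counts;
    }
}
```

The rows can be loaded with java.nio.file.Files.readAllLines. If the file starts with a header row, drop it before calling countByType, since the sketch would otherwise count the header as a type.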