- General info
- How to build RefactoringMiner
- How to use RefactoringMiner as a maven dependency
- Chrome extension
- Research
- Support for other programming languages
- Contributors
- API usage guidelines
- Location information for the detected refactorings
- Statement matching information for the detected refactorings
- Running RefactoringMiner from the command line
RefactoringMiner is a library/API written in Java that can detect refactorings applied in the history of a Java project.
Currently, it supports the detection of the following refactorings:
supported by RefactoringMiner 1.0 and newer versions
- Extract Method
- Inline Method
- Rename Method
- Move Method
- Move Attribute
- Pull Up Method
- Pull Up Attribute
- Push Down Method
- Push Down Attribute
- Extract Superclass
- Extract Interface
- Move Class
- Rename Class
- Extract and Move Method
- Rename Package
- Change Package (Move, Rename, Split, Merge)
supported by RefactoringMiner 2.0 and newer versions
- Move and Rename Class
- Extract Class
- Extract Subclass
- Extract Variable
- Inline Variable
- Parameterize Variable
- Rename Variable
- Rename Parameter
- Rename Attribute
- Move and Rename Attribute
- Replace Variable with Attribute
- Replace Attribute (with Attribute)
- Merge Variable
- Merge Parameter
- Merge Attribute
- Split Variable
- Split Parameter
- Split Attribute
- Change Variable Type
- Change Parameter Type
- Change Return Type
- Change Attribute Type
- Extract Attribute
- Move and Rename Method
- Move and Inline Method
supported by RefactoringMiner 2.1 and newer versions
- Add Method Annotation
- Remove Method Annotation
- Modify Method Annotation
- Add Attribute Annotation
- Remove Attribute Annotation
- Modify Attribute Annotation
- Add Class Annotation
- Remove Class Annotation
- Modify Class Annotation
- Add Parameter Annotation
- Remove Parameter Annotation
- Modify Parameter Annotation
- Add Variable Annotation
- Remove Variable Annotation
- Modify Variable Annotation
- Add Parameter
- Remove Parameter
- Reorder Parameter
- Add Thrown Exception Type
- Remove Thrown Exception Type
- Change Thrown Exception Type
- Change Method Access Modifier
supported by RefactoringMiner 2.2 and newer versions
- Change Attribute Access Modifier
- Encapsulate Attribute
- Parameterize Attribute
- Replace Attribute with Variable
- Add Method Modifier (final, static, abstract, synchronized)
- Remove Method Modifier (final, static, abstract, synchronized)
- Add Attribute Modifier (final, static, transient, volatile)
- Remove Attribute Modifier (final, static, transient, volatile)
- Add Variable Modifier (final)
- Add Parameter Modifier (final)
- Remove Variable Modifier (final)
- Remove Parameter Modifier (final)
- Change Class Access Modifier
- Add Class Modifier (final, static, abstract)
- Remove Class Modifier (final, static, abstract)
- Move Package
- Split Package
- Merge Package
- Localize Parameter
- Change Type Declaration Kind (class, interface, enum)
- Collapse Hierarchy
- Replace Loop with Pipeline
- Replace Anonymous with Lambda
supported by RefactoringMiner 2.3 and newer versions
- Merge Class
- Inline Attribute
- Replace Pipeline with Loop
supported by RefactoringMiner 2.3.2
- Split Class
In order to build the project, run ./gradlew jar (or gradlew jar on Windows) in the project's root directory.
Alternatively, you can generate a complete distribution zip, including all runtime dependencies, by running ./gradlew distZip.
You can also work on the project in the Eclipse IDE. First, run ./gradlew eclipse to generate the Eclipse project metadata files. Then, import the project into Eclipse using the Import Existing Project feature.
Since version 2.0, RefactoringMiner is available in the Maven Central Repository. In order to use RefactoringMiner as a maven dependency in your project, add the following snippet to your project's build configuration file:
<dependency>
<groupId>com.github.tsantalis</groupId>
<artifactId>refactoring-miner</artifactId>
<version>2.2.0</version>
</dependency>
If you want to get refactoring information when inspecting a commit on GitHub, you can install our Refactoring Aware Commit Review Chrome Extension.
The Chrome extension can detect refactorings for public projects and commits matching the following URL patterns:
https://github.com/user/project/commit/id
https://github.com/user/project/pull/id/commits/id
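As a rough illustration of the two URL shapes above, the following sketch checks whether a URL would fall under either pattern. It is only one reading of the patterns, not the extension's actual matching logic: the class name `CommitUrlPatterns` is hypothetical, and treating the pull request id as numeric is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of RefactoringMiner or the Chrome extension.
final class CommitUrlPatterns {
    // One path segment: any non-empty run of characters other than '/'.
    private static final String SEG = "[^/]+";

    // https://github.com/user/project/commit/id
    private static final Pattern COMMIT = Pattern.compile(
        "https://github\\.com/" + SEG + "/" + SEG + "/commit/" + SEG);

    // https://github.com/user/project/pull/id/commits/id
    // Assumption: the pull request id consists of digits only.
    private static final Pattern PULL_COMMIT = Pattern.compile(
        "https://github\\.com/" + SEG + "/" + SEG + "/pull/\\d+/commits/" + SEG);

    // Returns true when the whole URL matches one of the two patterns.
    static boolean isSupported(String url) {
        return COMMIT.matcher(url).matches() || PULL_COMMIT.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSupported(
            "https://github.com/danilofes/refactoring-toy-example/commit/36287f7c3b09eff78395267a3ac0d7da067863fd"));
        System.out.println(isSupported("https://github.com/apache/drill/pull/1807"));
    }
}
```

A pull request overview page (without a trailing /commits/id part) matches neither pattern in this sketch.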
If you are using RefactoringMiner in your research, please cite the following papers:
Nikolaos Tsantalis, Matin Mansouri, Laleh Eshkevari, Davood Mazinanian, and Danny Dig, "Accurate and Efficient Refactoring Detection in Commit History," 40th International Conference on Software Engineering (ICSE 2018), Gothenburg, Sweden, May 27 - June 3, 2018.
@inproceedings{Tsantalis:ICSE:2018:RefactoringMiner,
author = {Tsantalis, Nikolaos and Mansouri, Matin and Eshkevari, Laleh M. and Mazinanian, Davood and Dig, Danny},
title = {Accurate and Efficient Refactoring Detection in Commit History},
booktitle = {Proceedings of the 40th International Conference on Software Engineering},
series = {ICSE '18},
year = {2018},
isbn = {978-1-4503-5638-1},
location = {Gothenburg, Sweden},
pages = {483--494},
numpages = {12},
url = {http://doi.acm.org/10.1145/3180155.3180206},
doi = {10.1145/3180155.3180206},
acmid = {3180206},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {Git, Oracle, abstract syntax tree, accuracy, commit, refactoring},
}
Nikolaos Tsantalis, Ameya Ketkar, and Danny Dig, "RefactoringMiner 2.0," IEEE Transactions on Software Engineering, vol. 48, no. 3, pp. 930-950, March 2022.
@article{Tsantalis:TSE:2020:RefactoringMiner2.0,
author={Tsantalis, Nikolaos and Ketkar, Ameya and Dig, Danny},
title={RefactoringMiner 2.0},
journal={IEEE Transactions on Software Engineering},
year={2022},
volume={48},
number={3},
pages={930-950},
doi={10.1109/TSE.2020.3007722}
}
Keynote at the Fifth International Workshop on Refactoring (IWoR 2021)
RefactoringMiner has been used in the following studies:
- Danilo Silva, Nikolaos Tsantalis, and Marco Tulio Valente, "Why We Refactor? Confessions of GitHub Contributors," 24th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE 2016), Seattle, WA, USA, November 13-18, 2016.
- Davood Mazinanian, Ameya Ketkar, Nikolaos Tsantalis, and Danny Dig, "Understanding the use of lambda expressions in Java", Proceedings of the ACM on Programming Languages, vol. 1, issue OOPSLA, Article 85, 31 pages, October 2017.
- Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo Sousa, Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez, "Understanding the impact of refactoring on smells: a longitudinal study of 23 software projects," 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2017), Paderborn, Germany, September 4-8, 2017.
- Alexander Chávez, Isabella Ferreira, Eduardo Fernandes, Diego Cedrim, and Alessandro Garcia, "How does refactoring affect internal quality attributes?: A multi-project study," 31st Brazilian Symposium on Software Engineering (SBES 2017), Fortaleza, CE, Brazil, September 20-22, 2017.
- Navdeep Singh, and Paramvir Singh, "How Do Code Refactoring Activities Impact Software Developers' Sentiments? - An Empirical Investigation Into GitHub Commits," 24th Asia-Pacific Software Engineering Conference (APSEC 2017), Nanjing, Jiangsu, China, December 4-8, 2017.
- Mehran Mahmoudi, and Sarah Nadi, "The Android Update Problem: An Empirical Study," 15th International Conference on Mining Software Repositories (MSR 2018), Gothenburg, Sweden, May 28-29, 2018.
- Anthony Peruma, Mohamed Wiem Mkaouer, Michael J. Decker, and Christian D. Newman, "An empirical investigation of how and why developers rename identifiers," 2nd International Workshop on Refactoring (IWoR 2018), Montpellier, France, September 4, 2018.
- Patanamon Thongtanunam, Weiyi Shang, and Ahmed E. Hassan, "Will this clone be short-lived? Towards a better understanding of the characteristics of short-lived clones," Empirical Software Engineering, Volume 24, Issue 2, pp. 937–972, April 2019.
- Isabella Ferreira, Eduardo Fernandes, Diego Cedrim, Anderson Uchôa, Ana Carla Bibiano, Alessandro Garcia, João Lucas Correia, Filipe Santos, Gabriel Nunes, Caio Barbosa, Baldoino Fonseca, and Rafael de Mello, "The buggy side of code refactoring: understanding the relationship between refactorings and bugs," 40th International Conference on Software Engineering: Companion Proceedings (ICSE 2018), Gothenburg, Sweden, May 27-June 3, 2018.
- Matheus Paixao, "Software Restructuring: Understanding Longitudinal Architectural Changes and Refactoring," Ph.D. thesis, Computer Science Department, University College London, July 2018.
- Mehran Mahmoudi, Sarah Nadi, and Nikolaos Tsantalis, "Are Refactorings to Blame? An Empirical Study of Refactorings in Merge Conflicts," 26th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2019), Hangzhou, China, February 24-27, 2019.
- Bin Lin, Csaba Nagy, Gabriele Bavota and Michele Lanza, "On the Impact of Refactoring Operations on Code Naturalness," 26th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2019), Hangzhou, China, February 24-27, 2019.
- Sarah Fakhoury, Devjeet Roy, Sk. Adnan Hassan, and Venera Arnaoudova, "Improving Source Code Readability: Theory and Practice," 27th IEEE/ACM International Conference on Program Comprehension (ICPC 2019), Montreal, QC, Canada, May 25-26, 2019.
- Carmine Vassallo, Giovanni Grano, Fabio Palomba, Harald C. Gall, and Alberto Bacchelli, "A large-scale empirical exploration on refactoring activities in open source software projects," Science of Computer Programming, Volume 180, Pages 1-15, July 2019.
- Eman Abdullah AlOmar, Mohamed Wiem Mkaouer, and Ali Ouni, "Can refactoring be self-affirmed?: An exploratory study on how developers document their refactoring activities in commit messages," 3rd International Workshop on Refactoring (IWOR 2019), Montreal, QC, Canada, May 28, 2019.
- Ana Carla Bibiano, Eduardo Fernandes, Daniel Oliveira, Alessandro Garcia, Marcos Kalinowski, Baldoino Fonseca, Roberto Oliveira, Anderson Oliveira, and Diego Cedrim, "A Quantitative Study on Characteristics and Effect of Batch Refactoring on Code Smells," 13th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2019), Porto de Galinhas, Brazil, September 16-20, 2019.
- Eman Abdullah AlOmar, Mohamed Wiem Mkaouer, Ali Ouni, and Marouane Kessentini, "On the Impact of Refactoring on the Relationship between Quality Attributes and Design Metrics," 13th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2019), Porto de Galinhas, Brazil, September 16-20, 2019.
- Edmilson Campos Neto, Daniel Alencar da Costa, and Uirá Kulesza, "Revisiting and Improving SZZ Implementations," 13th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2019), Porto de Galinhas, Brazil, September 16-20, 2019.
- Valentina Lenarduzzi, Nyyti Saarimäki, and Davide Taibi, "The Technical Debt Dataset," 15th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE 2019), Porto de Galinhas, Brazil, September 18, 2019.
- Anthony Peruma, "A preliminary study of Android refactorings," 6th International Conference on Mobile Software Engineering and Systems (MOBILESoft 2019), Montreal, Quebec, Canada, May 25-26, 2019.
- Anthony Peruma, Mohamed Wiem Mkaouer, Michael J. Decker, and Christian D. Newman, "Contextualizing Rename Decisions using Refactorings and Commit Messages," 19th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2019), Cleveland, OH, USA, September 30-October 1, 2019.
- Soumaya Rebai, Oussama Ben Sghaier, Vahid Alizadeh, Marouane Kessentini, and Meriem Chater, "Interactive Refactoring Documentation Bot," 19th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2019), Cleveland, OH, USA, September 30-October 1, 2019.
- Matheus Paixao, and Paulo Henrique Maia, "Rebasing in Code Review Considered Harmful: A Large-Scale Empirical Investigation," 19th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2019), Cleveland, OH, USA, September 30-October 1, 2019.
- Willian Oizumi, Leonardo Da Silva Sousa, Anderson Oliveira, Luiz Matheus Alencar, Alessandro Garcia, Thelma E. Colanzi and Roberto Oliveira, "On the density and diversity of degradation symptoms in refactored classes: A multi-case study," 30th International Symposium on Software Reliability Engineering (ISSRE 2019), Berlin, Germany, October 28-31, 2019.
- Marcos César de Oliveira, Davi Freitas, Rodrigo Bonifácio, Gustavo Pinto, and David Lo, "Finding Needles in a Haystack: Leveraging Co-change Dependencies to Recommend Refactorings," Journal of Systems and Software, Volume 158, December 2019.
- Walter Lucas, Rodrigo Bonifácio, Edna Dias Canedo, Diego Marcílio, and Fernanda Lima, "Does the Introduction of Lambda Expressions Improve the Comprehension of Java Programs?," XXXIII Brazilian Symposium on Software Engineering (SBES 2019), Salvador, Brazil, September 23-27, 2019.
- Bo Shen, Wei Zhang, Haiyan Zhao, Guangtai Liang, Zhi Jin, and Qianxiang Wang, "IntelliMerge: A Refactoring-Aware Software Merging Technique," Proceedings of the ACM on Programming Languages, vol. 3, OOPSLA, Article 170, October 2019.
- Martina Iammarino, Fiorella Zampetti, Lerina Aversano, and Massimiliano Di Penta, "Self-Admitted Technical Debt Removal and Refactoring Actions: Co-Occurrence or More?," 35th IEEE International Conference on Software Maintenance and Evolution (ICSME 2019), Cleveland, OH, USA, September 29-October 4, 2019.
- Ally S. Nyamawe, Hui Liu, Nan Niu, Qasim Umer, and Zhendong Niu, "Automated Recommendation of Software Refactorings based on Feature Requests," 27th IEEE International Requirements Engineering Conference (RE 2019), Jeju Island, South Korea, September 23-27, 2019.
- Maurício Aniche, Erick Maziero, Rafael Durelli, and Vinicius Durelli, "The Effectiveness of Supervised Machine Learning Algorithms in Predicting Software Refactoring," IEEE Transactions on Software Engineering, 2020.
- Ana Bibiano, Vinicius Soares, Daniel Coutinho, Eduardo Fernandes, João Correia, Kleber Tarcísio, Anderson Oliveira, Alessandro Garcia, Rohit Gheyi, Marcio Ribeiro, Baldoino Fonseca, Caio Barbosa, and Daniel Oliveira, "How Does Incomplete Composite Refactoring Affect Internal Quality Attributes?," 28th IEEE International Conference on Program Comprehension (ICPC 2020), Seoul, South Korea, 2020.
- Leonardo Sousa, Willian Oizumi, Alessandro Garcia, Anderson Oliveira, Diego Cedrim, and Carlos Lucena, "When Are Smells Indicators of Architectural Refactoring Opportunities? A Study of 50 Software Projects," 28th IEEE International Conference on Program Comprehension (ICPC 2020), Seoul, South Korea, 2020.
- Devjeet Roy, Sarah Fakhoury, John Lee, and Venera Arnaoudova, "A Model to Detect Readability Improvements in Incremental Changes," 28th IEEE International Conference on Program Comprehension (ICPC 2020), Seoul, South Korea, 2020.
- Akira Fujimoto, Yoshiki Higo, Junnosuke Matsumoto, and Shinji Kusumoto, "Staged Tree Matching for Detecting Code Move across Files," 28th IEEE International Conference on Program Comprehension (ICPC 2020), Seoul, South Korea, 2020.
- Matheus Paixão, Anderson Uchôa, Ana Carla Bibiano, Daniel Oliveira, Alessandro Garcia, Jens Krinke, and Emilio Arvonio, "Behind the Intents: An In-depth Empirical Study on Software Refactoring in Modern Code Review," 17th International Conference on Mining Software Repositories (MSR 2020), Seoul, South Korea, 2020.
- Leonardo da Silva Sousa, Diego Cedrim, Alessandro Garcia, Willian Oizumi, Ana Carla Bibiano, Daniel Oliveira, Miryung Kim, and Anderson Oliveira, "Characterizing and Identifying Composite Refactorings: Concepts, Heuristics and Patterns," 17th International Conference on Mining Software Repositories (MSR 2020), Seoul, South Korea, 2020.
- Anthony Peruma, Christian D. Newman, Mohamed Wiem Mkaouer, Ali Ouni, and Fabio Palomba, "An Exploratory Study on the Refactoring of Unit Test Files in Android Applications," 4th International Workshop on Refactoring (IWoR 2020), Seoul, South Korea, 2020.
- Eman Abdullah AlOmar, Anthony Peruma, Christian D. Newman, Mohamed Wiem Mkaouer, and Ali Ouni, "On the Relationship Between Developer Experience and Refactoring: An Exploratory Study and Preliminary Results," 4th International Workshop on Refactoring (IWoR 2020), Seoul, South Korea, 2020.
- Yoshiki Higo, Shinpei Hayashi, and Shinji Kusumoto, "On Tracking Java Methods with Git Mechanisms," Journal of Systems and Software, Volume 165, July 2020.
- Eduardo Fernandes, Alexander Chávez, Alessandro Garcia, Isabella Ferreira, Diego Cedrim, Leonardo Sousa, and Willian Oizumi, "Refactoring Effect on Internal Quality Attributes: What Haven't They Told You Yet?," Information and Software Technology, 2020.
- Rrezarta Krasniqi, and Jane Cleland-Huang, "Enhancing Source Code Refactoring Detection with Explanations from Commit Messages," IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER 2020), London, ON, Canada, February 18-21, 2020.
- Anthony Peruma, Mohamed Wiem Mkaouer, Michael J. Decker, and Christian D. Newman, "Contextualizing rename decisions using refactorings, commit messages, and data types," Journal of Systems and Software, Volume 169, November 2020.
- Lerina Aversano, Umberto Carpenito, and Martina Iammarino, "An Empirical Study on the Evolution of Design Smells," Information, vol. 11, no. 7:348, 2020.
- Jevgenija Pantiuchina, Fiorella Zampetti, Simone Scalabrino, Valentina Piantadosi, Rocco Oliveto, Gabriele Bavota, and Massimiliano Di Penta, "Why Developers Refactor Source Code: A Mining-based Study," ACM Transactions on Software Engineering and Methodology, Volume 29, Issue 4, Article 29, September 2020.
- Ally S. Nyamawe, Hui Liu, Nan Niu, Qasim Umer, and Zhendong Niu, "Feature requests-based recommendation of software refactorings," Empirical Software Engineering, Volume 25, pp. 4315–4347, 2020.
- Eman Abdullah AlOmar, Mohamed Wiem Mkaouer, and Ali Ouni, "Toward the automatic classification of Self-Affirmed Refactoring," Journal of Systems and Software, Volume 171, January 2021.
- Vinícius Soares, Anderson Oliveira, Juliana Alves Pereira, Ana Carla Bibano, Alessandro Garcia, Paulo Roberto Farah, Silvia Regina Vergilio, Marcelo Schots, Caio Silva, Daniel Coutinho, Daniel Oliveira, and Anderson Uchôa, "On the Relation between Complexity, Explicitness, Effectiveness of Refactorings and Non-Functional Concerns," 34th Brazilian Symposium on Software Engineering (SBES 2020), October 19–23, 2020.
- Willian Oizumi, Diego Cedrim, Leonardo Sousa, Ana Carla Bibiano, Anderson Oliveira, Alessandro Garcia, and Daniel Oliveira, "Recommending Composite Refactorings for Smell Removal: Heuristics and Evaluation," 34th Brazilian Symposium on Software Engineering (SBES 2020), October 19–23, 2020.
- Massimiliano Di Penta, Gabriele Bavota, and Fiorella Zampetti, "On the Relationship between Refactoring Actions and Bugs: A Differentiated Replication," ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020), Sacramento, California, United States, November 8-13, 2020.
- Ameya Ketkar, Nikolaos Tsantalis, and Danny Dig, "Understanding Type Changes in Java," ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020), Sacramento, California, United States, November 8-13, 2020.
- Zhongxin Liu, Xin Xia, Meng Yan, and Shanping Li, "Automating Just-In-Time Comment Updating," 35th IEEE/ACM International Conference on Automated Software Engineering (ASE 2020), September 21–25, 2020.
- Zadia Codabux and Christopher Dutchyn, "Profiling Developers Through the Lens of Technical Debt," ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2020), October 8–9, 2020, Bari, Italy.
- Yiming Tang, Raffi Khatchadourian, Mehdi Bagherzadeh, Rhia Singh, Ajani Stewart, and Anita Raja, "An Empirical Study of Refactorings and Technical Debt in Machine Learning Systems," 43rd International Conference on Software Engineering (ICSE 2021), Madrid, Spain, May 25-28, 2021.
- Dong Jae Kim, Nikolaos Tsantalis, Tse-Hsun (Peter) Chen, and Jinqiu Yang, "Studying Test Annotation Maintenance in the Wild," 43rd International Conference on Software Engineering (ICSE 2021), Madrid, Spain, May 25-28, 2021.
- Yanjie Jiang, Hui Liu, Nan Niu, Lu Zhang, and Yamin Hu, "Extracting Concise Bug-Fixing Patches from Human-Written Patches in Version Control Systems," 43rd International Conference on Software Engineering (ICSE 2021), Madrid, Spain, May 25-28, 2021.
- Giovanni Rosa, Luca Pascarella, Simone Scalabrino, Rosalia Tufano, Gabriele Bavota, Michele Lanza, and Rocco Oliveto, "Evaluating SZZ Implementations Through a Developer-informed Oracle," 43rd International Conference on Software Engineering (ICSE 2021), Madrid, Spain, May 25-28, 2021.
- Bo Shen, Wei Zhang, Christian Kästner, Haiyan Zhao, Zhao Wei, Guangtai Liang, and Zhi Jin, "SmartCommit: a graph-based interactive assistant for activity-oriented commits," 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2021), Athens, Greece, August 23-28, 2021.
- Dimitrios Tsoukalas, Nikolaos Mittas, Alexander Chatzigeorgiou, Dionysios Kehagias, Apostolos Ampatzoglou, Theodoros Amanatidis, and Lefteris Angelis, "Machine Learning for Technical Debt Identification," IEEE Transactions on Software Engineering, 2021.
- Luca Traini, Daniele Di Pompeo, Michele Tucci, Bin Lin, Simone Scalabrino, Gabriele Bavota, Michele Lanza, Rocco Oliveto, and Vittorio Cortellessa, "How Software Refactoring Impacts Execution Time," ACM Transactions on Software Engineering and Methodology, Volume 31, Issue 2, Article 25, pp. 1-23, April 2022.
- Jarosław Pokropiński, Jakub Gąsiorek, Patryk Kramarczyk, and Lech Madeyski, "SZZ Unleashed-RA-C: An Improved Implementation of the SZZ Algorithm and Empirical Comparison with Existing Open Source Solutions," Developments in Information & Knowledge Management for Business Applications : Volume 3, Springer International Publishing, pp. 181-199, 2022.
- Eman Abdullah AlOmar, Jiaqian Liu, Kenneth Addo, Mohamed Wiem Mkaouer, Christian Newman, Ali Ouni, and Zhe Yu, "On the documentation of refactoring types," Automated Software Engineering, Volume 29, Article 9, 2022.
- Giulia Sellitto, Emanuele Iannone, Zadia Codabux, Valentina Lenarduzzi, Andrea De Lucia, Fabio Palomba, and Filomena Ferrucci, "Toward Understanding the Impact of Refactoring on Program Comprehension," 29th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2022), Honolulu, Hawaii, USA, March 15-18, 2022.
- Eman Abdullah AlOmar, Tianjia Wang, Vaibhavi Raut, Mohamed Wiem Mkaouer, Christian Newman, and Ali Ouni, "Refactoring for Reuse: An Empirical Study," arXiv:2111.07002v1, 13 Nov 2021.
- Anton Ivanov, Zarina Kurbatova, Yaroslav Golubev, Andrey Kirilenko, and Timofey Bryksin, "AntiCopyPaster: Extracting Code Duplicates As Soon As They Are Introduced in the IDE," arXiv:2112.15230v1, 30 Dec 2021.
- Max Ellis, Sarah Nadi, and Danny Dig, "A Systematic Comparison of Two Refactoring-aware Merging Techniques," arXiv:2112.10370v1, 20 Dec 2021.
- KotlinRMiner has been developed by JetBrains Research. The project is led and maintained by Zarina Kurbatova.
- PyRef has been developed by Hassan Atwi and Bin Lin from the Software Institute at USI - Università della Svizzera Italiana, Switzerland.
- Py-RefactoringMiner has been developed by Malinda Dilhara, a Ph.D. student in the Department of Computer Science at the University of Colorado Boulder, under the supervision of Danny Dig.
The code in package gr.uom.java.xmi.* is developed by Nikolaos Tsantalis.
The code in package org.refactoringminer.* was initially developed by Danilo Ferreira e Silva and later extended by Nikolaos Tsantalis.
RefactoringMiner can automatically detect refactorings in the entire history of git repositories, between specified commits or tags, or at specified commits.
In the code snippet below we demonstrate how to print all refactorings performed in the toy project https://github.com/danilofes/refactoring-toy-example.git.
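// Imports used by this snippet
import java.util.List;
import org.eclipse.jgit.lib.Repository;
import org.refactoringminer.api.GitHistoryRefactoringMiner;
import org.refactoringminer.api.GitService;
import org.refactoringminer.api.Refactoring;
import org.refactoringminer.api.RefactoringHandler;
import org.refactoringminer.rm1.GitHistoryRefactoringMinerImpl;
import org.refactoringminer.util.GitServiceImpl;

GitService gitService = new GitServiceImpl();
GitHistoryRefactoringMiner miner = new GitHistoryRefactoringMinerImpl();
// Clone the repository into the given folder, unless it has already been cloned
Repository repo = gitService.cloneIfNotExists(
    "tmp/refactoring-toy-example",
    "https://github.com/danilofes/refactoring-toy-example.git");
// The handler is called once per analyzed commit of the "master" branch
miner.detectAll(repo, "master", new RefactoringHandler() {
  @Override
  public void handle(String commitId, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitId);
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
});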
You can also analyze the history between two commits using detectBetweenCommits, or between two tags using detectBetweenTags. RefactoringMiner will iterate through all non-merge commits from the start commit/tag to the end commit/tag.
// start commit: 819b202bfb09d4142dece04d4039f1708735019b
// end commit: d4bce13a443cf12da40a77c16c1e591f4f985b47
miner.detectBetweenCommits(repo,
    "819b202bfb09d4142dece04d4039f1708735019b", "d4bce13a443cf12da40a77c16c1e591f4f985b47",
    new RefactoringHandler() {
  @Override
  public void handle(String commitId, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitId);
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
});
// start tag: 1.0
// end tag: 1.1
miner.detectBetweenTags(repo, "1.0", "1.1", new RefactoringHandler() {
  @Override
  public void handle(String commitId, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitId);
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
});
It is possible to analyze a specific commit using detectAtCommit instead of detectAll. The commit is identified by its SHA key, as in the example below:
miner.detectAtCommit(repo, "05c1e773878bbacae64112f70964f4f2f7944398", new RefactoringHandler() {
  @Override
  public void handle(String commitId, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitId);
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
});
You can get the churn of a specific commit using churnAtCommit as follows:
Churn churn = miner.churnAtCommit(repo, "05c1e773878bbacae64112f70964f4f2f7944398", handler);
There is also a lower level API that compares the Java files in two directories containing the code before and after some changes:
UMLModel model1 = new UMLModelASTReader(new File("/path/to/version1")).getUmlModel();
UMLModel model2 = new UMLModelASTReader(new File("/path/to/version2")).getUmlModel();
UMLModelDiff modelDiff = model1.diff(model2);
List<Refactoring> refactorings = modelDiff.getRefactorings();
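To use the API that fetches all information directly from GitHub (detectAtCommit and detectAtPullRequest called with a repository URL), please provide a valid OAuth token in the github-oauth.properties file.
You can generate an OAuth token in GitHub under Settings -> Developer settings -> Personal access tokens.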
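Before calling the GitHub-based methods, it can be useful to confirm that the properties file actually contains a non-empty token. The sketch below does this with java.util.Properties. The class name `OAuthTokenCheck` is hypothetical, and the key name `OAuthToken` is an assumption; check the github-oauth.properties file in your checkout for the key it really uses.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of RefactoringMiner.
final class OAuthTokenCheck {
    // Assumed key name; adjust it to match your properties file.
    static final String TOKEN_KEY = "OAuthToken";

    // Returns true when the properties text defines a non-blank token.
    static boolean hasToken(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String token = props.getProperty(TOKEN_KEY);
        return token != null && !token.trim().isEmpty();
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("github-oauth.properties");
        // A missing file is treated the same as an empty one.
        String text = Files.exists(file)
            ? new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
            : "";
        System.out.println(hasToken(text) ? "Token found" : "No token configured");
    }
}
```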
If you don't want to clone the repository locally, you can use the following code snippet:
GitHistoryRefactoringMiner miner = new GitHistoryRefactoringMinerImpl();
// The last argument is a timeout, in seconds
miner.detectAtCommit("https://github.com/danilofes/refactoring-toy-example.git",
    "36287f7c3b09eff78395267a3ac0d7da067863fd", new RefactoringHandler() {
  @Override
  public void handle(String commitId, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitId);
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
}, 10);
If you want to analyze all commits of a pull request, you can use the following code snippet:
GitHistoryRefactoringMiner miner = new GitHistoryRefactoringMinerImpl();
miner.detectAtPullRequest("https://github.com/apache/drill.git", 1807, new RefactoringHandler() {
  @Override
  public void handle(String commitId, List<Refactoring> refactorings) {
    System.out.println("Refactorings at " + commitId);
    for (Refactoring ref : refactorings) {
      System.out.println(ref.toString());
    }
  }
}, 10);
All classes implementing the Refactoring interface include refactoring-specific location information.
For example, ExtractOperationRefactoring offers the following methods:
- getSourceOperationCodeRangeBeforeExtraction(): Returns the code range of the source method in the parent commit
- getSourceOperationCodeRangeAfterExtraction(): Returns the code range of the source method in the child commit
- getExtractedOperationCodeRange(): Returns the code range of the extracted method in the child commit
- getExtractedCodeRangeFromSourceOperation(): Returns the code range of the extracted code fragment from the source method in the parent commit
- getExtractedCodeRangeToExtractedOperation(): Returns the code range of the extracted code fragment to the extracted method in the child commit
- getExtractedOperationInvocationCodeRange(): Returns the code range of the invocation to the extracted method inside the source method in the child commit
Each method returns a CodeRange object including the following properties:
String filePath
int startLine
int endLine
int startColumn
int endColumn
Alternatively, you can use the methods List<CodeRange> leftSide() and List<CodeRange> rightSide() to get a list of CodeRange objects for the left side (i.e., parent commit) and right side (i.e., child commit) of the refactoring, respectively.
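One possible use of this location information is to label a refactoring with a compact position string, or to test whether a changed line falls inside a reported range. The sketch below uses a stand-in class `Range` that carries only the five properties listed above; it is not the library's CodeRange class, and the names `CodeRangeSketch`, `describe`, and `coversLine` are hypothetical.

```java
// Hypothetical illustration; Range is a stand-in for the library's CodeRange.
final class CodeRangeSketch {
    static final class Range {
        final String filePath;
        final int startLine;
        final int endLine;
        final int startColumn;
        final int endColumn;

        Range(String filePath, int startLine, int endLine, int startColumn, int endColumn) {
            this.filePath = filePath;
            this.startLine = startLine;
            this.endLine = endLine;
            this.startColumn = startColumn;
            this.endColumn = endColumn;
        }
    }

    // Formats a range as path:startLine:startColumn-endLine:endColumn.
    static String describe(Range r) {
        return r.filePath + ":" + r.startLine + ":" + r.startColumn
            + "-" + r.endLine + ":" + r.endColumn;
    }

    // True when the given line lies between startLine and endLine, inclusive.
    static boolean coversLine(Range r, int line) {
        return line >= r.startLine && line <= r.endLine;
    }

    public static void main(String[] args) {
        Range r = new Range("src/Example.java", 10, 20, 5, 6);
        System.out.println(describe(r));
        System.out.println(coversLine(r, 15));
    }
}
```

With the real API, the same logic could be applied to each element returned by leftSide() and rightSide(), reading the corresponding properties from the CodeRange objects.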
All method-related refactoring (Extract/Inline/Move/Rename/ExtractAndMove Operation) objects come with a UMLOperationBodyMapper object, which can be obtained by calling the getBodyMapper() method on the refactoring object.
Let's consider the Extract Method refactoring in commit JetBrains/intellij-community@7ed3f27.
#1. You can use the following code snippet to obtain the newly added statements in the extracted method:
ExtractOperationRefactoring refactoring = ...;
UMLOperationBodyMapper mapper = refactoring.getBodyMapper();
List<StatementObject> newLeaves = mapper.getNonMappedLeavesT2(); //newly added leaf statements
List<CompositeStatementObject> newComposites = mapper.getNonMappedInnerNodesT2(); //newly added composite statements
List<StatementObject> deletedLeaves = mapper.getNonMappedLeavesT1(); //deleted leaf statements
List<CompositeStatementObject> deletedComposites = mapper.getNonMappedInnerNodesT1(); //deleted composite statements
For the Extract Method refactoring example shown above, mapper.getNonMappedLeavesT2() returns the following statements:
final String url = pageNumber == 0 ? "courses" : "courses?page=" + String.valueOf(pageNumber);
final CoursesContainer coursesContainer = getFromStepic(url,CoursesContainer.class);
return coursesContainer.meta.containsKey("has_next") && coursesContainer.meta.get("has_next") == Boolean.TRUE;
#2. You can use the following code snippet to obtain the matched statements between the original and the extracted methods:
ExtractOperationRefactoring refactoring = ...;
UMLOperationBodyMapper mapper = refactoring.getBodyMapper();
for (AbstractCodeMapping mapping : mapper.getMappings()) {
  AbstractCodeFragment fragment1 = mapping.getFragment1();
  AbstractCodeFragment fragment2 = mapping.getFragment2();
  Set<Replacement> replacements = mapping.getReplacements();
  for (Replacement replacement : replacements) {
    String valueBefore = replacement.getBefore();
    String valueAfter = replacement.getAfter();
    ReplacementType type = replacement.getType();
  }
}
For the Extract Method refactoring example shown above, mapping.getReplacements() returns the following AST node replacement for the pair of matched statements:
final List<CourseInfo> courseInfos = getFromStepic("courses",CoursesContainer.class).courses;
final List<CourseInfo> courseInfos = coursesContainer.courses;
Replacement: getFromStepic("courses",CoursesContainer.class) -> coursesContainer
ReplacementType: VARIABLE_REPLACED_WITH_METHOD_INVOCATION
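To make the relationship between the two matched statements concrete, the sketch below applies the reported replacement to the text of the first statement and checks that the result equals the second statement. It is a string-level illustration only: RefactoringMiner compares AST nodes rather than raw strings, and the names `ReplacementDemo` and `applyReplacement` are hypothetical, not part of the API.

```java
// Illustrative only: RefactoringMiner works on AST nodes, not on raw strings.
// ReplacementDemo and applyReplacement are hypothetical names, not part of the API.
final class ReplacementDemo {

    /** Replaces the first occurrence of {@code before} in {@code statement} with {@code after}. */
    static String applyReplacement(String statement, String before, String after) {
        int index = statement.indexOf(before);
        if (index < 0) {
            return statement; // the replaced text does not occur; leave the statement unchanged
        }
        return statement.substring(0, index) + after + statement.substring(index + before.length());
    }

    public static void main(String[] args) {
        String fragment1 = "final List<CourseInfo> courseInfos = getFromStepic(\"courses\",CoursesContainer.class).courses;";
        String fragment2 = "final List<CourseInfo> courseInfos = coursesContainer.courses;";

        String rewritten = applyReplacement(
                fragment1,
                "getFromStepic(\"courses\",CoursesContainer.class)", // replacement.getBefore()
                "coursesContainer");                                  // replacement.getAfter()

        System.out.println(rewritten.equals(fragment2)); // prints true
    }
}
```

Once the replacement is applied, the two fragments become textually identical, which is why the pair is reported as matched.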
#3. You can use the following code snippet to obtain the overlapping refactorings in the extracted method:
ExtractOperationRefactoring refactoring = ...;
UMLOperationBodyMapper mapper = refactoring.getBodyMapper();
Set<Refactoring> overlappingRefactorings = mapper.getRefactorings();
For the Extract Method Refactoring example shown above, mapper.getRefactorings() returns the following refactoring:
Extract Variable coursesContainer : CoursesContainer in method private addCoursesFromStepic(result List<CourseInfo>, pageNumber int) : boolean from class com.jetbrains.edu.stepic.EduStepicConnector
because variable coursesContainer = getFromStepic(url,CoursesContainer.class) has been extracted from the following statement of the original method by replacing string literal "courses" with variable url:
final List<CourseInfo> courseInfos = getFromStepic("courses",CoursesContainer.class).courses;
When you build a distributable application with ./gradlew distZip, you can run RefactoringMiner as a command-line application. Extract the file under build/distribution/RefactoringMiner.zip to the desired location, and cd into the bin folder (or include it in your path). Then, run RefactoringMiner -h to show its usage:
> RefactoringMiner -h
-h Show options
-a <git-repo-folder> <branch> -json <path-to-json-file> Detect all refactorings at <branch> for <git-repo-folder>. If <branch> is not specified, commits from all branches are analyzed.
-bc <git-repo-folder> <start-commit-sha1> <end-commit-sha1> -json <path-to-json-file> Detect refactorings between <start-commit-sha1> and <end-commit-sha1> for project <git-repo-folder>
-bt <git-repo-folder> <start-tag> <end-tag> -json <path-to-json-file> Detect refactorings between <start-tag> and <end-tag> for project <git-repo-folder>
-c <git-repo-folder> <commit-sha1> -json <path-to-json-file> Detect refactorings at specified commit <commit-sha1> for project <git-repo-folder>
-gc <git-URL> <commit-sha1> <timeout> -json <path-to-json-file> Detect refactorings at specified commit <commit-sha1> for project <git-URL> within the given <timeout> in seconds. All required information is obtained directly from GitHub using the OAuth token in github-oauth.properties
-gp <git-URL> <pull-request> <timeout> -json <path-to-json-file> Detect refactorings at specified pull request <pull-request> for project <git-URL> within the given <timeout> in seconds for each commit in the pull request. All required information is obtained directly from GitHub using the OAuth token in github-oauth.properties
With a locally cloned repository, run:
> git clone https://github.com/danilofes/refactoring-toy-example.git refactoring-toy-example
> ./RefactoringMiner -c refactoring-toy-example 36287f7c3b09eff78395267a3ac0d7da067863fd
If you don't want to clone the repository locally, run:
> ./RefactoringMiner -gc https://github.com/danilofes/refactoring-toy-example.git 36287f7c3b09eff78395267a3ac0d7da067863fd 10
For all options, you can add the -json <path-to-json-file> command arguments to save the JSON output in a file. The results are appended to the file after each processed commit.
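If you script these invocations from Java, the argument order from the usage text can be assembled programmatically. The following is a minimal sketch, assuming the launcher is reachable as `./RefactoringMiner`; `CommandBuilder`, `commitCommand`, and the output file name `out.json` are hypothetical and not part of RefactoringMiner.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles the arguments of the -c option, with an optional -json target.
final class CommandBuilder {

    static List<String> commitCommand(String launcher, String repoFolder, String commitSha1, String jsonPath) {
        if (repoFolder == null || repoFolder.isEmpty()) {
            throw new IllegalArgumentException("repoFolder is required");
        }
        if (commitSha1 == null || commitSha1.isEmpty()) {
            throw new IllegalArgumentException("commitSha1 is required");
        }
        List<String> command = new ArrayList<>();
        command.add(launcher);
        command.add("-c");
        command.add(repoFolder);
        command.add(commitSha1);
        if (jsonPath != null) { // -json is optional
            command.add("-json");
            command.add(jsonPath);
        }
        return command;
    }

    public static void main(String[] args) {
        List<String> command = commitCommand(
                "./RefactoringMiner",
                "refactoring-toy-example",
                "36287f7c3b09eff78395267a3ac0d7da067863fd",
                "out.json"); // hypothetical output file
        System.out.println(String.join(" ", command));
        // The list can be handed to new ProcessBuilder(command) to actually run the tool.
    }
}
```

Keeping each argument as a separate list element avoids shell-quoting problems when the repository path contains spaces.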
For the -gc and -gp options, you must provide a valid OAuth token in the github-oauth.properties file stored in the bin folder. You can generate an OAuth token in GitHub Settings -> Developer settings -> Personal access tokens.
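Before running -gc or -gp, it can help to verify that the properties file actually contains a token. The sketch below reads properties content with `java.util.Properties` and checks for a non-empty value. The key name `OAuthToken` is an assumption about the expected file layout, so confirm it against your RefactoringMiner version; `OAuthTokenCheck` is a hypothetical helper.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: checks that properties content defines a non-empty token.
final class OAuthTokenCheck {

    // Assumed key name; verify it against the RefactoringMiner version you use.
    static final String TOKEN_KEY = "OAuthToken";

    static boolean hasToken(Reader source) {
        Properties properties = new Properties();
        try {
            properties.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String token = properties.getProperty(TOKEN_KEY);
        return token != null && !token.trim().isEmpty();
    }

    public static void main(String[] args) {
        // In practice, pass a FileReader for bin/github-oauth.properties instead of a StringReader.
        System.out.println(hasToken(new StringReader("OAuthToken=<your-token>\n"))); // prints true
        System.out.println(hasToken(new StringReader("OAuthToken=\n")));             // prints false
    }
}
```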
In both cases, you will get the output in JSON format:
{
  "commits": [{
    "repository": "https://github.com/danilofes/refactoring-toy-example.git",
    "sha1": "36287f7c3b09eff78395267a3ac0d7da067863fd",
    "url": "https://github.com/danilofes/refactoring-toy-example/commit/36287f7c3b09eff78395267a3ac0d7da067863fd",
    "refactorings": [{
        "type": "Pull Up Attribute",
        "description": "Pull Up Attribute private age : int from class org.animals.Labrador to class org.animals.Dog",
        "leftSideLocations": [{
          "filePath": "src/org/animals/Labrador.java",
          "startLine": 5,
          "endLine": 5,
          "startColumn": 14,
          "endColumn": 21,
          "codeElementType": "FIELD_DECLARATION",
          "description": "original attribute declaration",
          "codeElement": "age : int"
        }],
        "rightSideLocations": [{
          "filePath": "src/org/animals/Dog.java",
          "startLine": 5,
          "endLine": 5,
          "startColumn": 14,
          "endColumn": 21,
          "codeElementType": "FIELD_DECLARATION",
          "description": "pulled up attribute declaration",
          "codeElement": "age : int"
        }]
      },
      {
        "type": "Pull Up Attribute",
        "description": "Pull Up Attribute private age : int from class org.animals.Poodle to class org.animals.Dog",
        "leftSideLocations": [{
          "filePath": "src/org/animals/Poodle.java",
          "startLine": 5,
          "endLine": 5,
          "startColumn": 14,
          "endColumn": 21,
          "codeElementType": "FIELD_DECLARATION",
          "description": "original attribute declaration",
          "codeElement": "age : int"
        }],
        "rightSideLocations": [{
          "filePath": "src/org/animals/Dog.java",
          "startLine": 5,
          "endLine": 5,
          "startColumn": 14,
          "endColumn": 21,
          "codeElementType": "FIELD_DECLARATION",
          "description": "pulled up attribute declaration",
          "codeElement": "age : int"
        }]
      },
      {
        "type": "Pull Up Method",
        "description": "Pull Up Method public getAge() : int from class org.animals.Labrador to public getAge() : int from class org.animals.Dog",
        "leftSideLocations": [{
          "filePath": "src/org/animals/Labrador.java",
          "startLine": 7,
          "endLine": 9,
          "startColumn": 2,
          "endColumn": 3,
          "codeElementType": "METHOD_DECLARATION",
          "description": "original method declaration",
          "codeElement": "public getAge() : int"
        }],
        "rightSideLocations": [{
          "filePath": "src/org/animals/Dog.java",
          "startLine": 7,
          "endLine": 9,
          "startColumn": 2,
          "endColumn": 3,
          "codeElementType": "METHOD_DECLARATION",
          "description": "pulled up method declaration",
          "codeElement": "public getAge() : int"
        }]
      },
      {
        "type": "Pull Up Method",
        "description": "Pull Up Method public getAge() : int from class org.animals.Poodle to public getAge() : int from class org.animals.Dog",
        "leftSideLocations": [{
          "filePath": "src/org/animals/Poodle.java",
          "startLine": 7,
          "endLine": 9,
          "startColumn": 2,
          "endColumn": 3,
          "codeElementType": "METHOD_DECLARATION",
          "description": "original method declaration",
          "codeElement": "public getAge() : int"
        }],
        "rightSideLocations": [{
          "filePath": "src/org/animals/Dog.java",
          "startLine": 7,
          "endLine": 9,
          "startColumn": 2,
          "endColumn": 3,
          "codeElementType": "METHOD_DECLARATION",
          "description": "pulled up method declaration",
          "codeElement": "public getAge() : int"
        }]
      }
    ]
  }]
}
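A quick way to summarize such output is to count the detected refactorings per type. The sketch below does this with a regular expression over the raw JSON text, using only the Java standard library. It is a rough approximation that assumes the field layout shown above; for anything beyond a quick tally, a real JSON parser is the safer choice. `RefactoringTypeCounter` is a hypothetical helper, not part of RefactoringMiner.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: tallies the "type" values found in RefactoringMiner's JSON output.
final class RefactoringTypeCounter {

    // Matches only the "type" key; "codeElementType" is skipped because the
    // opening quote must directly precede the lowercase word type.
    private static final Pattern TYPE_FIELD =
            Pattern.compile("\"type\"\\s*:\\s*\"([^\"]+)\"");

    static Map<String, Integer> countByType(String json) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by type name
        Matcher matcher = TYPE_FIELD.matcher(json);
        while (matcher.find()) {
            counts.merge(matcher.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Abbreviated sample in the shape of the output above.
        String json = "{ \"commits\": [{ \"refactorings\": ["
                + "{ \"type\": \"Pull Up Attribute\", \"leftSideLocations\": [{ \"codeElementType\": \"FIELD_DECLARATION\" }] },"
                + "{ \"type\": \"Pull Up Method\", \"leftSideLocations\": [{ \"codeElementType\": \"METHOD_DECLARATION\" }] }"
                + "] }] }";
        System.out.println(countByType(json)); // prints {Pull Up Attribute=1, Pull Up Method=1}
    }
}
```

Applied to the full output shown above, this would report two Pull Up Attribute and two Pull Up Method refactorings.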