- General info
- How to build PyEvolve
- Research
- API usage guidelines
- Running PyEvolve in VirtualBox Image
PyEvolve automates frequently repeated code changes in Python systems. We built a pipeline for mining and automating best code evolution practices, ensuring that your project does not fall behind. The following is a high-level overview of the pipeline.
- Phase 1: We use R-CPATMiner to mine best practices from version history of Python systems. You can find details on executing R-CPATMiner in its project description.
- Phase 2: We use InferRules to infer the initial transformation rules for the patterns detected by R-CPATMiner.
- Phase 3: PyEvolve uses the inferred rules as input to identify potential sites in the target code where the patterns can be applied.
- Phase 4: PyEvolve infers the final adapted rules to transplant the identified patterns.
To have a fully built PyEvolve, you have to install the following components.
- We use RulePharser to generate an AST for Comby templates that includes both Python and Comby syntaxes. Follow the steps in RulePharser to build it locally and add it to your local maven repository.
- We use ComBy as a backend tool to rewrite code. Please follow the steps in their documentation to install it on your PC.
- We use wala.ml. To install it, follow the instructions on wala.ml.
After completing the above steps, run ./gradlew build from the root directory. This builds the project and executes the test cases. To build the project without running the tests, use ./gradlew build -x test instead.
In this example, we identify two usages of the following pattern in the project PatternTest.
Pattern:
# type :[l3] : List[int]
# type :[[l1]] : int
# type :[[l2]] : int
:[[l1]] = 0
for :[[l2]] in :[l3]:
    :[[l1]]=:[[l1]]+:[[l2]]
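To make the hole syntax in the pattern easier to read, here is a minimal sketch that translates such a template into a plain Java regular expression. It assumes the usual Comby reading of the two hole forms: `:[[name]]` stands for an identifier-like token and `:[name]` for a lazily matched stretch of text, with repeated names required to bind the same text. The class `HoleSketch` and its methods are hypothetical names for illustration only. This is not how Comby or PyEvolve match code; it ignores the type comments, balanced delimiters, and everything else the real tools handle.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of PyEvolve or Comby.
final class HoleSketch {
    // Group 1 captures the name in ":[[name]]", group 2 the name in ":[name]".
    private static final Pattern HOLE =
            Pattern.compile(":\\[\\[(\\w+)\\]\\]|:\\[(\\w+)\\]");

    // Translate a template into a regex. The first use of a hole becomes a
    // named group; later uses become back-references to that group.
    static Pattern toRegex(String template) {
        StringBuilder regex = new StringBuilder();
        Set<String> seen = new HashSet<>();
        Matcher m = HOLE.matcher(template);
        int last = 0;
        while (m.find()) {
            regex.append(literal(template.substring(last, m.start())));
            boolean identifierOnly = m.group(1) != null;
            String name = identifierOnly ? m.group(1) : m.group(2);
            if (seen.add(name)) {
                regex.append("(?<").append(name).append(">")
                     .append(identifierOnly ? "\\w+" : ".+?")
                     .append(")");
            } else {
                regex.append("\\k<").append(name).append(">");
            }
            last = m.end();
        }
        regex.append(literal(template.substring(last)));
        return Pattern.compile(regex.toString());
    }

    // Quote literal text, but let any whitespace run match any whitespace run.
    static String literal(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            boolean ws = Character.isWhitespace(text.charAt(i));
            int j = i;
            while (j < text.length() && Character.isWhitespace(text.charAt(j)) == ws) {
                j++;
            }
            out.append(ws ? "\\s+" : Pattern.quote(text.substring(i, j)));
            i = j;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String template = ":[[l1]] = 0\nfor :[[l2]] in :[l3]:\n    :[[l1]]=:[[l1]]+:[[l2]]";
        String code = "total = 0\nfor n in nums:\n    total=total+n";
        Matcher m = toRegex(template).matcher(code);
        if (m.find()) {
            System.out.println("l1=" + m.group("l1") + " l2=" + m.group("l2") + " l3=" + m.group("l3"));
        }
    }
}
```

In the sample input, the accumulator, the loop variable, and the iterated list bind to `l1`, `l2`, and `l3` respectively. A loop that accumulates into a different variable than the one initialised to 0 would not match, because the repeated `l1` hole has to bind the same name each time.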
- Clone the project PyEvolve
- Update the Configurations.PROJECT_REPOSITORY and Configurations.TYPE_REPOSITORY configurations with valid paths.
- Clone the project PatternTest to the directory Configurations.PROJECT_REPOSITORY + "/pythonInfer/"
- Execute the script type_infer.py to infer type information of the project PatternTest
- Copy the generated json files into the folder Configurations.TYPE_REPOSITORY + "/pythonInfer/PatternTest/"
It is important that the folders Configurations.PROJECT_REPOSITORY and Configurations.TYPE_REPOSITORY have the directory structure shown below. As an example, we consider the project pythonInfer/PatternTest.
Configurations.PROJECT_REPOSITORY
|
|--pythonInfer
    |
    |--PatternTest
        |
        |- file1.py
        |- file2.py
Configurations.TYPE_REPOSITORY
|
|--pythonInfer
    |
    |--PatternTest
        |
        |- file1.json
        |- file2.json
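Because the two folders have to mirror each other, it can help to check the layout before running PyEvolve. The following is a minimal sketch using only the Java standard library. It assumes that every `.py` file under the project folder should have a `.json` file with the same base name at the same relative location in the type repository, and that nested packages are mirrored the same way as the flat example above. The class `LayoutCheck` and the method `missingTypeFiles` are hypothetical helpers, not part of PyEvolve.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not part of the PyEvolve code base.
final class LayoutCheck {

    // Returns the Python files (relative to the project folder) that have no
    // matching .json file in the mirrored folder of the type repository.
    static List<Path> missingTypeFiles(Path projectRepository,
                                       Path typeRepository,
                                       String project) throws IOException {
        Path sources = projectRepository.resolve(project);
        Path types = typeRepository.resolve(project);
        try (Stream<Path> files = Files.walk(sources)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".py"))
                    .map(sources::relativize)
                    .filter(rel -> {
                        String name = rel.getFileName().toString();
                        String json = name.substring(0, name.length() - 3) + ".json";
                        return !Files.exists(types.resolve(rel).resolveSibling(json));
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway layout in a temporary directory for demonstration.
        Path root = Files.createTempDirectory("layout-demo");
        Path projects = root.resolve("projects");
        Path types = root.resolve("types");
        String project = "pythonInfer/PatternTest";
        Files.createDirectories(projects.resolve(project));
        Files.createDirectories(types.resolve(project));
        Files.createFile(projects.resolve(project).resolve("file1.py"));
        Files.createFile(projects.resolve(project).resolve("file2.py"));
        Files.createFile(types.resolve(project).resolve("file1.json"));

        System.out.println("Missing type files: " + missingTypeFiles(projects, types, project));
    }
}
```

An empty result suggests the type information was copied to the expected place. Any listed file points to a missing or misplaced json file.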
- The following code can be used to get all usages of the pattern:
import com.utils.Utils;
import org.junit.jupiter.api.Test;

import static com.matching.fgpdg.Configurations.PROJECT_REPOSITORY; // This must be set to root directory of the folder "pythonInfer"

public class DetectPattern {
    @Test
    void testPattern() throws Exception {
        String pattern = "# type number : Any\n" +
                "# type :[l3] : List[int]\n" +
                "# type :[[l1]] : int\n" +
                "# type :[[l2]] : int\n" +
                ":[[l1]] = 0\n" +
                "for :[[l2]] in :[l3]:\n" +
                "    :[[l1]]=:[[l1]]+:[[l2]]";
        String outPath = "./OUTPUT/";
        String projectPath = PROJECT_REPOSITORY + "pythonInfer/PatternTest";
        System.out.println(pattern);
        Utils.processProjectForPattern(projectPath, pattern, outPath);
    }
}
PyEvolve can assist you if you have a rule encoded in ComBy syntax and want to transplant it to a Python file. First, you should follow the steps described in the section "Using PyEvolve to identify code usages" to prepare the data and arrange them into folders.
void transplantPatternToFile() {
    Configurations.TYPE_REPOSITORY = ""; // this should be the path to your type repository
    Configurations.PROJECT_REPOSITORY = ""; // this should be the path to your projects repository
    String projectFile = ""; // path to your file, relative to Configurations.PROJECT_REPOSITORY
    String LHS = ""; // path to the LHS of the rule file
    String RHS = ""; // path to the RHS of the rule file
    String adaptedFile = MainAdaptor.transplantPatternToFile(projectFile, LHS, RHS); // adaptedFile is the adapted file
}
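Since projectFile is expected to be relative to Configurations.PROJECT_REPOSITORY, the following minimal sketch shows one way to derive that relative path from an absolute file path using the standard `java.nio.file` API. The class `RelativeProjectFile` and the method `relativeTo` are hypothetical names, and whether PyEvolve expects a particular path separator is not stated here, so the result uses the platform default.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of the PyEvolve code base.
final class RelativeProjectFile {

    // Returns the path of `file` relative to `projectRepository`, or throws
    // if the file does not live inside the repository folder.
    static String relativeTo(String projectRepository, String file) {
        Path root = Paths.get(projectRepository).toAbsolutePath().normalize();
        Path target = Paths.get(file).toAbsolutePath().normalize();
        if (!target.startsWith(root)) {
            throw new IllegalArgumentException(target + " is not inside " + root);
        }
        return root.relativize(target).toString();
    }

    public static void main(String[] args) {
        // Example paths are placeholders; substitute your own locations.
        String repository = "/data/projects";
        String file = "/data/projects/pythonInfer/PatternTest/file1.py";
        System.out.println(relativeTo(repository, file));
    }
}
```

The returned string could then be used as the value of projectFile in the snippet above.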
PyEvolve can also assist you if you have a rule encoded in ComBy syntax and want to transplant it to a particular Python function.
void transplantPatternToFunction() {
    Configurations.TYPE_REPOSITORY = ""; // this should be the path to your type repository
    Configurations.PROJECT_REPOSITORY = ""; // this should be the path to your projects repository
    String projectFile = ""; // path to your file, relative to Configurations.PROJECT_REPOSITORY
    String LHS = ""; // path to the LHS of the rule file
    String RHS = ""; // path to the RHS of the rule file
    int i = 0; // placeholder: index of the target function among the module's top-level statements
    Module codeModule = com.utils.Utils.getPythonModule(projectFile);
    stmt stmt = codeModule.getInternalBody().get(i); // i must be the statement number of the function
    // Collect the module's import statements so they can be passed along with the function
    List<org.python.antlr.base.stmt> imports = codeModule.getInternalBody().stream()
            .filter(x -> x instanceof Import || x instanceof ImportFrom)
            .collect(Collectors.toList());
    String adaptedFunction = MainAdaptor.transplantPatternToFunction(projectFile, (FunctionDef) stmt, imports, LHS, RHS);
}
- Download VirtualBox from https://www.virtualbox.org
- Download the VirtualBox image from Link. This virtual machine runs MacOS and requires 50 GB of free space on your system. It has all of the required dependencies so that you can start using the tools right away.
- Use the steps given in this link to execute the tool.
We will add citation information as soon as possible.
If you are using PyEvolve in your research, please cite the following papers:
Malinda Dilhara, Danny Dig, and Ameya Ketkar, "PyEvolve: Automating Frequent Code Changes in Python ML Systems," 45th International Conference on Software Engineering (ICSE 2023), Melbourne, Australia.
@inproceedings{Dilhara:ICSE:2023:PyEvolve,
author = {Dilhara, Malinda and Dig, Danny and Ketkar, Ameya},
title = {PyEvolve: Automating Frequent Code Changes in Python ML Systems},
booktitle = {Proceedings of the 45th International Conference on Software Engineering},
series = {ICSE '23},
year = {2023},
location = {Melbourne, Australia},
publisher = {ACM},
address = {New York, NY, USA},
}
All software provided in this repository is subject to the Apache License Version 2.0.