Code for our ACL 2023 paper "Understanding Programs by Exploiting (Fuzzing) Test Cases".
- torch 1.7.0
- transformers 4.18.0
Download codebert-base and unixcoder-base, and move the files to ./microsoft/codebert-base and ./microsoft/unixcoder-base, respectively.
Given a piece of code and a collection of candidates as the input, the task is to return the top-K codes with the same semantics. Models are evaluated by the MAP@R score. MAP@R is defined as the mean of average precision scores, each of which is computed for retrieving the R most similar samples given a query. For a code (query), R is the number of other codes in the same class, i.e., R = 499 in this dataset.
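The metric above can be sketched in plain Python (a minimal illustration of the definition, not the official evaluation script; the function names are our own):

```python
def average_precision_at_r(relevances, r):
    """AP@R for one query: relevances is a list of 0/1 flags over the
    top-r retrieved candidates (1 = same class as the query)."""
    hits = 0
    precision_sum = 0.0
    for i, rel in enumerate(relevances[:r], start=1):
        if rel:
            hits += 1
            precision_sum += hits / i  # precision at rank i, counted only on hits
    return precision_sum / r

def map_at_r(all_relevances, r):
    """Mean of AP@R over all queries."""
    return sum(average_precision_at_r(rel, r) for rel in all_relevances) / len(all_relevances)
```

For POJ-104, r = 499, since each problem class contains 500 programs and the other 499 are the relevant items for a given query.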
We take the POJ-104 dataset as an example for this task.
Download the POJ dataset from programs.tar.gz and the POJ fuzzing data from POJ104.io.tar.gz.
```shell
mv programs.tar.gz POJ104.io.tar.gz ./dataset
cd dataset
tar zxvf programs.tar.gz
tar zxvf POJ104.io.tar.gz
mv fuzz output
python preprocess_clone.py
cd ..
```
After preprocessing the dataset, you will obtain three .jsonl files: train_fuzz_clone.jsonl, valid_fuzz_clone.jsonl, and test_fuzz_clone.jsonl.
The processed .jsonl files are also provided in ./dataset.
In each file, each line represents one program. The fields of one row are:
- code: the source code
- label: the index of the problem that the source code solves
- index: the index of the example
- fuzz: the fuzzing test cases, consisting of input and output pairs
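Each line can be parsed with the standard json module. The record below is a made-up illustration of the four fields; the exact structure of the fuzz field in the released files may differ:

```python
import json

# A fabricated example line with the four documented fields.
line = json.dumps({
    "code": "int main() { int a, b; scanf(\"%d %d\", &a, &b); printf(\"%d\", a + b); }",
    "label": 1,                                 # problem that this program solves
    "index": 0,                                 # index of the example
    "fuzz": [{"input": "3 5", "output": "8"}],  # fuzzing input/output pairs (format assumed)
})

example = json.loads(line)
print(example["label"], len(example["fuzz"]))
```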
Data statistics of the dataset are shown in the table below:
| Split | #Problems | #Examples |
|---|---|---|
| Train | 64 | 32,000 |
| Dev | 16 | 8,000 |
| Test | 24 | 12,000 |
We use the script provided by the task website to evaluate predictions for this task, and report the MAP@R score.
We also provide a pipeline of FuzzTuning on this task in ./clone/run.sh.
The code classification task requires assigning the same label to programs that were implemented to solve the same problem and achieve the same goal. Models are evaluated by accuracy.
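Accuracy here is simply the fraction of programs whose predicted problem label matches the gold label (a minimal sketch, not the repository's evaluation code):

```python
def accuracy(predictions, labels):
    """Fraction of examples where the predicted class equals the gold class."""
    assert len(predictions) == len(labels) and labels
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)
```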
We again take the POJ-104 dataset as an example for this task.
The dataset preprocessing is the same as for clone detection, except that the final step is:
```shell
python preprocess_cls.py
```
After preprocessing the dataset, you will obtain three .jsonl files: train_fuzz_cls.jsonl, valid_fuzz_cls.jsonl, and test_fuzz_cls.jsonl.
The processed .jsonl files are also provided in ./dataset.
In each file, each line represents one program. The fields of one row are:
- code: the source code
- label: the index of the problem that the source code solves
- index: the index of the example
- fuzz: the fuzzing test cases, consisting of input and output pairs
Data statistics of the dataset are shown in the table below:
| Split | #Examples |
|---|---|
| Train | 31,338 |
| Dev | 10,294 |
| Test | 10,368 |
We also provide a pipeline of FuzzTuning on this task in ./cls/run.sh.
Please cite our work in your publications if it helps your research:
```bibtex
@article{zhao2023understanding,
  title={Understanding Programs by Exploiting (Fuzzing) Test Cases},
  author={Zhao, Jianyu and Rong, Yuyang and Guo, Yiwen and He, Yifeng and Chen, Hao},
  journal={arXiv preprint arXiv:2305.13592},
  year={2023}
}
```