LLM PEFT for APR

Dependencies

Python

Python 3.9.17
PyTorch 2.0.1
Huggingface transformers 4.35.2
wandb
peft 0.6.2

accelerate 0.24.1
datasets 2.13.0
trl
fire

nvitop

Others

Java 8

Content

The file structure of the artifact is as follows:

APR-INSTRUCTION_construct:

Contains the source code for constructing APR-INSTRUCTION, based on an existing APR dataset [1].

codellama_7b_hf:

output: PEFT weights produced by the different fine-tuning methods (LoRA, p-tuning, prefix-tuning, $(IA)^3$) and by full-model fine-tuning
results: patches generated on the benchmarks (HumanEval-Java, Defects4J, and QuixBugs) by codellama-7b-hf, with and without PEFT weights, together with the validation results for those patches

codellama_13b_hf:

output: PEFT weights produced by the different PEFT methods (LoRA, p-tuning, prefix-tuning, $(IA)^3$)
results: patches generated on the benchmarks (HumanEval-Java, Defects4J, and QuixBugs) by codellama-13b-hf, with and without PEFT weights, together with the validation results for those patches

deepseek_coder_6.7b:

output: PEFT weights produced by the different PEFT methods (LoRA, p-tuning, prefix-tuning, $(IA)^3$)
results: patches generated on the benchmarks (HumanEval-Java, Defects4J, and QuixBugs) by Deepseek-Coder Base 6.7B, with and without PEFT weights, together with the validation results for those patches

llama2_7b_hf:

output: PEFT weights produced by the different PEFT methods (LoRA, p-tuning, prefix-tuning, $(IA)^3$)
results: patches generated on the benchmarks (HumanEval-Java, Defects4J, and QuixBugs) by Llama-2-7b-hf, with and without PEFT weights, together with the validation results for those patches

instruction_tuning_dataset:

Instruction datasets used in this paper:
- apr_instruction_30k.json: the APR instruction dataset constructed in this paper
- oss_instrcution_30k.json: a random selection of 30k samples from the OSS-Instruction dataset
- code_alpaca_20k.json: the Code Alpaca instruction dataset
- The remaining files are used in RQ3 to study the impact of training data size on performance; they contain subsets of 10k, 15k, 20k, and 25k samples
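
As an illustration of how fixed-size subsets like these could be drawn, here is a minimal sketch. It assumes each dataset file is a single JSON array of records; the function names, the output file names, and the seed are hypothetical and are not taken from this repository.

```python
import json
import random
from pathlib import Path


def subsample(records, size, seed=0):
    """Return `size` records drawn without replacement, reproducibly."""
    if size > len(records):
        raise ValueError(f"requested {size} records but only {len(records)} available")
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    return rng.sample(records, size)


def write_subsets(src, out_dir, sizes=(10_000, 15_000, 20_000, 25_000), seed=0):
    """Write one JSON file per requested size and return the written paths.

    Assumption: `src` is a JSON array of instruction records.
    The output naming scheme (subset_<size>.json) is hypothetical.
    """
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for size in sizes:
        path = out_dir / f"subset_{size}.json"
        subset = subsample(records, size, seed)
        path.write_text(json.dumps(subset, ensure_ascii=False, indent=2), encoding="utf-8")
        paths.append(path)
    return paths
```

Fixing the seed keeps the subsets reproducible across runs, which matters when comparing models trained on different data sizes.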

inference_and_validation_src:

This directory contains the source code used for patch generation and validation with the LLMs.

file name	description
defects4j_patch_validate.py	Patch generation and validation on the Defects4J benchmark
humaneval_patch_validate.py	Patch generation and validation on the HumanEval-Java benchmark
quixbugs_patch_validate.py	Patch generation and validation on the QuixBugs benchmark
peft_patch_validation.py	Entry point for validating models fine-tuned with PEFT methods; selects the benchmark-specific script for verification
fmft_generate_patch.py	Entry point for validating models trained with full-model fine-tuning; selects the benchmark-specific script for verification
generate_patch_infill.py	Entry point for validating CodeLlama 7B without fine-tuning, using infill templates; selects the benchmark-specific script for verification
prompter.py	Converts benchmark instances into instructions
result_look.py	Records $pass@k$ for each validation run
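
For readers unfamiliar with $pass@k$, the sketch below shows the commonly used unbiased estimator, $1 - \binom{n-c}{k} / \binom{n}{k}$, where $n$ patches are generated for a bug and $c$ of them pass validation. This is an assumption about the metric's definition: result_look.py may compute it differently, and the function names here are hypothetical.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate the probability that at least one of k sampled patches is correct.

    n: number of patches generated for one bug
    c: number of those patches that pass validation
    k: sample budget, with 1 <= k <= n
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect patches exist, so any k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(per_bug_counts, k):
    """Average pass@k over bugs; per_bug_counts is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in per_bug_counts) / len(per_bug_counts)
```

For example, `mean_pass_at_k([(10, 0), (10, 10)], k=1)` averages one bug that is never fixed with one that is always fixed.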

inference_scripts:

This directory contains the bash scripts used for patch generation and validation with the LLMs.
Each script is named as: model name_fine-tuning method_instruction dataset used for fine-tuning_validation.sh

train_scripts:

This directory contains the bash scripts used for LLM training.
Each script is named as: model name_instruction_fine-tuning method_hyper-parameters (optional)_train_instruction dataset used for fine-tuning_validation.sh

train_src:

This directory contains the source code used for LLM training.

file name	description
sfttrain_peft.py	Training code for PEFT methods
sfttrain_ft.py	Training code for Full-model Fine-tuning
prompter.py	Adds an additional prompt to each instruction
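
To show what wrapping an instruction in an additional prompt can look like, here is a minimal sketch using an Alpaca-style template. The template text and the function name are assumptions for illustration; the actual wording used by prompter.py may differ.

```python
# Alpaca-style templates (an assumption; not copied from prompter.py).
TEMPLATE_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
TEMPLATE_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(instruction, input_text=""):
    """Wrap an instruction (and optional input, e.g. buggy code) in the template."""
    if input_text:
        return TEMPLATE_WITH_INPUT.format(instruction=instruction, input=input_text)
    return TEMPLATE_NO_INPUT.format(instruction=instruction)
```

The prompt ends at the response marker so that the model's completion is the patch itself.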

results_hyper_parameters:

This directory contains the patch generation and validation results from the RQ3 experiments.

NOTICE

Because the fine-tuned weights are too large, we have not uploaded them to GitHub yet.
To preserve anonymity during review, we will release the weights after the review process.

Cites

[1] Zhu, Qihao, et al. "A syntax-guided edit decoder for neural program repair." Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 2021.
