/CQIL

Learning Code-Query Interaction for Enhancing Code Searches

This repository includes the source code for the paper 'Learning Code-Query Interaction for Enhancing Code Searches'.

Dependencies

Tested on Ubuntu 18.04.

Install the required packages with pip:

$ pip install -r requirements.txt

Dataset

Download Datasets

We use two datasets:

  1. CODEnn [1], which can be downloaded from Google Drive
  2. CosBench [2], which can be downloaded from Google Drive

Download the datasets and use them to replace the files in the /data folder.

The /data/example folder provides a small sample dataset for a quick start.

Data Preparation

To generate preprocessed data:

$ python pipeline.py

Train & Evaluate

To train our model:

$ python main.py --mode train

To evaluate our model:

$ python main.py --mode eval
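
The preprocessing, training, and evaluation commands above can also be chained from one small driver script. The following is only a sketch and is not part of this repository: the names `STEPS`, `build_commands`, and `run_all`, as well as the `--run` flag, are hypothetical. It assumes the script is started from the repository root, so that pipeline.py and main.py resolve, and that the steps should run in the order this README lists them.

```python
import subprocess
import sys

# Steps copied from this README, in the order they appear.
STEPS = [
    ["pipeline.py"],
    ["main.py", "--mode", "train"],
    ["main.py", "--mode", "eval"],
]


def build_commands(python=sys.executable):
    """Return the full command line for each step."""
    return [[python, *step] for step in STEPS]


def run_all(dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    Execution stops at the first step that exits with a non-zero status.
    """
    for cmd in build_commands():
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical flag of this sketch: without --run, commands are only printed.
    run_all(dry_run="--run" not in sys.argv[1:])
```

Run without arguments, the script only prints the three commands. With `--run` it executes them one after another, so evaluation starts only if preprocessing and training both finished without error.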

References

[1] X. Gu, H. Zhang, and S. Kim, “Deep code search,” in 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE). IEEE, 2018, pp. 933–944.

[2] S. Yan, H. Yu, Y. Chen, B. Shen, and L. Jiang, “Are the code snippets what we are searching for? A benchmark and an empirical study on code search with natural-language queries,” in 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 2020, pp. 344–354.