CGCL-codes/naturalcc

Is the Implementations part model reimplemented by yourselves

skye95git opened this issue · 4 comments

Thanks for your great work! I have a few questions:

  1. Did you reimplement the models listed under Implementations yourselves, or did you collect the official open-source implementations?
  2. The DeepCS link is broken.
  3. In the Code Retrieval (Search) section, is there a pre-training implementation for CodeBERT or GraphCodeBERT?
  4. Does the dataset preprocessing produce the data flow graph and control flow graph corresponding to the code?

Hi @skye95git,
Thanks for your interest in our work. I will ask our team members to answer your questions. For the CFG and DFG part, we currently recommend SVF, a tool developed by our team members (https://github.com/SVF-tools/SVF).

Answers:

  1. Some of the models are open source but were implemented on other platforms (such as Torch7 or TF). We either ported them to NaturalCC or re-implemented them from the papers or GitHub repos.
  2. We will check it out.
  3. The authors of CodeBERT and GraphCodeBERT have not released their pre-training scripts, and we do not have sufficient resources to re-implement them. We (MSRA) don't plan to release the pre-training code in the near future. :(
  4. Not yet. You can refer to Data-flow and control-flow graphs for Java.

Hi @skye95git,
I noticed that you also asked a question in the DeepCS repo, in the issue "Evaluation Benchmark on the trained model" (#16).
Have you tried retraining DeepCS on the CodeSearchNet dataset? Maybe we can discuss it.

Unfortunately, I only retrained the model on the dataset mentioned in the paper.