
👨‍💻 An awesome, curated list of the best code LLMs for research.

License: MIT

👨‍💻 Awesome-Code-LLM


🚀 Leaderboard

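Leaderboard (sorted by HumanEval Pass@1; blank cells are not reported in the list)

| Rank | Model | Params | HumanEval | MBPP | HF | Paper |
|------|-------|--------|-----------|------|----|-------|
| 1 | GPT-4 + Reflexion | ? | 91.0 | 77.1 | | paper |
| 2 | GPT-4 | ? | 67.0 | | | paper |
| 3 | PanGu-Coder2 | 15B | 61.6 | | | paper |
| 4 | WizardCoder-15B | 15B | 57.3 | 51.8 | ckpt | paper |
| 5 | GPT-3.5 | ? | 48.1 | | | paper |
| 6 | Code-Davinci-002 | ? | 47.0 | | | paper |
| 7 | StarCoder-15B (Prompted) | 15B | 40.8 | 49.5 | ckpt | paper |
| 8 | PaLM 2-S | ? | 37.6 | 50.0 | | paper |
| 9 | PaLM-Coder-540B | 540B | 36.0 | 47.0 | | paper |
| 10 | InstructCodeT5+ | 16B | 35.0 | | | paper |
| 11 | StarCoder-15B | 15B | 33.6 | 52.7 | ckpt | paper |
| 12 | Code-Cushman-001 | ? | 33.5 | 45.9 | | paper |
| 13 | CodeT5+ | 16B | 30.9 | | | paper |
| 14 | LLaMA2-70B | 70B | 29.9 | | ckpt | paper |
| 15 | CodeGen-16B-Mono | 16B | 29.3 | 35.3 | | paper |
| 16 | PaLM-540B | 540B | 26.2 | 36.8 | | paper |
| 17 | LLaMA-65B | 65B | 23.7 | 37.7 | | paper |
| 18 | CodeGeeX | 13B | 22.9 | 24.4 | | paper |
| 19 | LLaMA-33B | 33B | 21.7 | 30.2 | | paper |
| 20 | CodeGen-16B-Multi | 16B | 18.3 | 20.9 | | paper |
| 21 | AlphaCode | 1.1B | 17.1 | | | paper |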
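
For readers unfamiliar with the ranking metric, here is a minimal sketch of the unbiased pass@k estimator introduced in "Evaluating Large Language Models Trained on Code" (listed under Pre-Training). For each problem, n samples are generated, c of them pass the unit tests, and pass@k = 1 - C(n-c, k) / C(n, k), averaged over problems. The function names and the sample counts below are illustrative assumptions; this list does not state how each paper sampled or decoded, so the scores above may not be strictly comparable.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect samples: every size-k subset contains a correct one.
    if n - c < k:
        return 1.0
    # 1 minus the probability that a random size-k subset is entirely incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds one (n, c) pair per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical per-problem counts (n samples, c correct), not real benchmark data.
    samples = [(10, 3), (10, 10), (10, 0)]
    print(f"pass@1 = {100 * mean_pass_at_k(samples, k=1):.1f}")
```

With k = 1 the estimator reduces to the fraction of correct samples per problem, which is why Pass@1 is often reported from a single greedy sample per problem.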

💡 Toolkit:

📚 Paper

▶️ Pre-Training

  1. Evaluating Large Language Models Trained on Code Preprint

    [Paper] Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, et al., 2021.07

  2. CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis ICLR23

    [Paper] Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong, 2022.03

  3. CodeGen2: Lessons for Training LLMs on Programming and Natural Languages ICLR23

    [Paper] Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou, 2023.05

  4. SantaCoder: don't reach for the stars! Preprint

    [Paper] Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, et al., 2023.01

  5. StarCoder: may the source be with you! Preprint

    [Paper] Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, et al., 2023.05

▶️ Instruction Tuning

  1. WizardCoder: Empowering Code Large Language Models with Evol-Instruct Preprint

    [Paper] Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang, 2023.07

▶️ Alignment with Feedback

  1. PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback Preprint

    [Paper] Bo Shen, Jiaxin Zhang, Taihong Chen, Daoguang Zan, Bing Geng, An Fu, Muhan Zeng, Ailun Yu, Jichuan Ji, Jingyang Zhao, Yuenan Guo, Qianxiang Wang, 2023.07

▶️ Prompting

  1. CodeT: Code Generation with Generated Tests ICLR23

    [Paper] Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen, 2022.07

  2. Coder Reviewer Reranking for Code Generation ICML23

    [Paper] Tianyi Zhang, Tao Yu, Tatsunori B Hashimoto, Mike Lewis, Wen-tau Yih, Daniel Fried, Sida I Wang, 2022.11

▶️ Evaluation & Benchmark

  1. Measuring Coding Challenge Competence With APPS NeurIPS21

    Named APPS

    [Paper][Repo] Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, Jacob Steinhardt, 2021.05

  2. Program Synthesis with Large Language Models Preprint

    Named MBPP

    [Paper] Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton, 2021.08

🙌 Contributors

This is an active repository and your contributions are always welcome! If you have any questions about this opinionated list, do not hesitate to contact me at huybery@gmail.com.

Cite as

@software{awesome-code-llm,
  author = {Binyuan Hui},
  title = {An awesome and curated list of best code-LLM for research},
  howpublished = {\url{https://github.com/huybery/Awesome-Code-LLM}},
  year = 2023,
}

Acknowledgement

This project is inspired by Awesome-LLM.
