Juanting-Xu's Stars
HqWu-HITCS/Awesome-Chinese-LLM
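A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are inexpensive to train, including base models, vertical-domain fine-tuning and applications, datasets, and tutorials.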
salesforce/CodeRL
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (NeurIPS22).
datawhalechina/easy-rl
A reinforcement learning tutorial in Chinese (the "Mushroom Book" 🍄). Read online at https://datawhalechina.github.io/easy-rl/
LiSir-HIT/Reinforcement-Learning
Various kinds of reinforcement learning models implemented in PyTorch
bigcode-project/starcoder
Home of StarCoder: fine-tuning & inference!
OpenMOSS/MOSS
An open-source tool-augmented conversational language model from Fudan University
jayelm/gisting
Learning to Compress Prompts with Gist Tokens - https://arxiv.org/abs/2304.08467
microsoft/DeepSpeedExamples
Example models using DeepSpeed
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
google-deepmind/code_contests
tkarabela/bigpython
Source code for Big Python tutorials on YouTube
salesforce/CodeGen
CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
reddy-lab-code-research/XLCoST
Code and data for XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence
microsoft/JARVIS
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
ntunlp/xCodeEval
xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval
ntunlp/ExecEval
A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages.
hendrycks/apps
APPS: Automated Programming Progress Standard (NeurIPS 2021)
PhoebusSi/Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and integrate as many LLM-related technologies as possible. We built a fine-tuning platform that makes it easy for researchers to get started with and use large models.
openai/chatgpt-retrieval-plugin
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
THUDM/CodeGeeX
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
VHellendoorn/Code-LMs
Guide to using pre-trained large language models of source code
pvs-hd-tea/TCrules
Rule-Based Translation with TransCoder-ST Data
THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
monish001/CodeContests
GlowingNinja is a record of code from different coding contests, such as CodeChef, TopCoder, and InterviewStreet.
LaverdeS/End_to_End_Code_Generation
Using CoNaLa Dataset
facebookresearch/CodeGen
Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pretrained models.
CoderEval/CoderEval
A collection of practical code generation tasks and tests in open source projects. Complementary to HumanEval by OpenAI.
microsoft/CodeXGLUE
CodeXGLUE
rbhatia4245/github_test_repos_analysis_bigquery
A collection of SQL queries used in Google BigQuery to find what percentage of GitHub repos contain test files
google-research-datasets/Attributed-QA
We believe the ability of an LLM to attribute the text that it generates is likely to be crucial for both system developers and users in information-seeking scenarios. This release consists of human-rated system outputs for a new question-answering task, Attributed Question Answering (AQA).