ASY246's Stars
pengxiao-song/LaWGPT
🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese legal knowledge. A large language model based on Chinese legal knowledge.
meta-llama/llama
Inference code for Llama models
hkust-nlp/ceval
Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
dstl/re3d
Relationship and Entity Extraction Evaluation Dataset
BlinkDL/ChatRWKV
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
Zjh-819/LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
Vision-CAIR/MiniGPT-4
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
CLUEbenchmark/SuperCLUE
SuperCLUE: A comprehensive benchmark for general-purpose Chinese foundation models
run-llama/llama_index
LlamaIndex is a data framework for your LLM applications
ymcui/Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
OpenMOSS/MOSS
An open-source tool-augmented conversational language model from Fudan University
xcanwin/KeepChatGPT
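A plugin that improves ChatGPT's data security and efficiency. It also shares many innovative features for free, such as auto-refresh, keep-alive, data security, audit cancellation, conversation cloning, unabridged replies, page cleanup, large-screen display, tracking interception, continuous updates, and fine-grained insight. It makes the AI experience exceptionally secure, smooth, efficient, and clean.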
chatchat-space/Langchain-Chatchat
Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based RAG and Agent app built with Langchain on language models such as ChatGLM, Qwen, and Llama
MiuLab/TC-Bot
User Simulation for Task-Completion Dialogues
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
LianjiaTech/BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)
esbatmop/MNBVC
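MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, aiming to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.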
CLUEbenchmark/CLUECorpus2020
Large-scale Pre-training Corpus for Chinese: a 100G Chinese pre-training corpus
HarderThenHarder/transformers_tasks
⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SFT etc.
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
thu-coai/CDial-GPT
A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
bigscience-workshop/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
bigscience-workshop/petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
f/awesome-chatgpt-prompts
This repo includes ChatGPT prompt curation to use ChatGPT better.