kongya's Stars
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
epfml/dynamic-sparse-flash-attention
OpenBMB/ToolBench
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
Watchful1/PushshiftDumps
Example scripts for the pushshift dump files
EleutherAI/the-pile
nouhadziri/THRED
The implementation of the paper "Augmenting Neural Response Generation with Context-Aware Topical Attention"
togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
databrickslabs/dolly
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
OpenMOSS/MOSS
An open-source tool-augmented conversational language model from Fudan University
lamini-ai/lamini
X-PLUG/mPLUG-Owl
mPLUG-Owl & mPLUG-Owl2: Modularized Multimodal Large Language Model
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
LAION-AI/Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on a massive set of synthetic instructions to perform millions of tasks.
X-PLUG/ChatPLUG
A Chinese Open-Domain Dialogue System
yoheinakajima/babyagi
nichtdax/awesome-totally-open-chatgpt
A list of totally open alternatives to ChatGPT
yaodongC/awesome-instruction-dataset
A collection of open-source datasets for training instruction-following LLMs (ChatGPT, LLaMA, Alpaca)
Instruction-Tuning-with-GPT-4/GPT-4-LLM
Instruction Tuning with GPT-4
orhonovich/unnatural-instructions
thunlp/UltraChat
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
radi-cho/datasetGPT
A command-line interface to generate textual and conversational datasets with LLMs.
radi-cho/botbots
A dataset of diverse dialogues between two ChatGPT (gpt-3.5-turbo) instances with system messages written by GPT-4, covering various contexts and tasks (task-oriented dialogue systems, abstract reasoning, brainstorming).
teknium1/GPTeacher
A collection of modular datasets generated by GPT-4: General-Instruct, Roleplay-Instruct, Code-Instruct, and Toolformer
Facico/Chinese-Vicuna
Chinese-Vicuna: a Chinese instruction-following LLaMA-based model; a low-resource Chinese LLaMA + LoRA approach with a structure modeled after Alpaca
PhoebusSi/Alpaca-CoT
We unify the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use, providing a fine-tuning platform that makes it easy for researchers to get started with large models. We welcome open-source enthusiasts to open any meaningful PR on this repo and to integrate as many LLM-related technologies as possible.
project-baize/baize-chatbot
Let ChatGPT teach your own chatbot in hours with a single GPU!
microsoft/ContextualSP
Multiple paper open-source codes of the Microsoft Research Asia DKI group