pesean's Stars
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
HuaizhengZhang/AI-System-School
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. 🗃️ Llama3, Mistral, etc. 🧑‍💻 Video Tutorials.
state-spaces/mamba
Mamba SSM architecture
THUDM/ChatGLM3
ChatGLM3 series: Open Bilingual Chat LLMs
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
mistralai/mistral-inference
Official inference library for Mistral models
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
flexflow/FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
microsoft/Llama-2-Onnx
hiyouga/FastEdit
🩹Editing large language models within 10 seconds⚡
ztxz16/fastllm
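A pure C++ cross-platform LLM acceleration library that can be called from Python. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile phones.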
tpoisonooo/llama.onnx
LLaMa/RWKV onnx models, quantization and testcase
THUDM/WebGLM
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
excalidraw/excalidraw
Virtual whiteboard for sketching hand-drawn like diagrams
facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
apache/parquet-java
Apache Parquet Java
hkust-nlp/ceval
Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
mosaicml/llm-foundry
LLM training code for Databricks foundation models
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
EgoAlpha/prompt-in-context-learning
Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
nomic-ai/gpt4all
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
qwopqwop200/GPTQ-for-LLaMa
4 bits quantization of LLaMA using GPTQ
brightmart/nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
LianjiaTech/BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese dialogue LLM)
PhoebusSi/Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. We built this fine-tuning platform so that researchers can easily get started with and use large models. We welcome open-source enthusiasts to open any meaningful PR on this repo and integrate as many LLM-related technologies as possible.
THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model