pesean

pesean's Stars

sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Language: Python, 5.9k stars, 482 forks
HuaizhengZhang/AI-System-School
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. 🗃️ Llama3, Mistral, etc. 🧑‍💻 Video Tutorials.
2.7k stars, 307 forks
state-spaces/mamba
Mamba SSM architecture
Language: Python, 13.1k stars, 1.1k forks
THUDM/ChatGLM3
ChatGLM3 series: Open Bilingual Chat LLMs
Language: Python, 13.5k stars, 1.6k forks
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python, 29.7k stars, 4.5k forks
mistralai/mistral-inference
Official inference library for Mistral models
Language: Jupyter Notebook, 9.7k stars, 860 forks
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Language: Python, 6.6k stars, 363 forks
ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Language: Python, 33.8k stars, 5.8k forks
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Language: Jupyter Notebook, 2.3k stars, 156 forks
flexflow/FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
Language: C++, 1.7k stars, 226 forks
microsoft/Llama-2-Onnx
Language: Python, 1k stars, 93 forks
hiyouga/FastEdit
🩹Editing large language models within 10 seconds⚡
Language: Python, 1.3k stars, 89 forks
ztxz16/fastllm
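A pure C++ cross-platform LLM acceleration library with Python bindings. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile devices.
Language: C++, 3.3k stars, 339 forks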
tpoisonooo/llama.onnx
LLaMa/RWKV onnx models, quantization and testcase
Language: Python, 350 stars, 31 forks
THUDM/WebGLM
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
Language: Python, 1.6k stars, 135 forks
excalidraw/excalidraw
Virtual whiteboard for sketching hand-drawn like diagrams
Language: TypeScript, 84.2k stars, 7.9k forks
facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
Language: C++, 31.3k stars, 3.6k forks
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python, 14.1k stars, 1.3k forks
apache/parquet-java
Apache Parquet Java
Language: Java, 2.6k stars, 1.4k forks
hkust-nlp/ceval
Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
Language: Python, 1.6k stars, 78 forks
mosaicml/llm-foundry
LLM training code for Databricks foundation models
Language: Python, 4k stars, 526 forks
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language: Jupyter Notebook, 36k stars, 4.2k forks
EgoAlpha/prompt-in-context-learning
Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.
Language: Jupyter Notebook, 1.5k stars, 92 forks
microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python, 35.4k stars, 4.1k forks
nomic-ai/gpt4all
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
Language: C++, 70.5k stars, 7.7k forks
qwopqwop200/GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
Language: Python, 3k stars, 458 forks
brightmart/nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
9.5k stars, 1.5k forks
LianjiaTech/BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)
Language: HTML, 7.9k stars, 758 forks
PhoebusSi/Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and integrate as many LLM related technologies as possible. We built a fine-tuning platform that makes it easy for researchers to get started with and use large models, and we welcome open-source enthusiasts to submit any meaningful PR!
Language: Jupyter Notebook, 2.6k stars, 246 forks
THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Language: Python, 40.6k stars, 5.2k forks