Pinned Repositories
Yi
A series of large language models trained from scratch by developers @01-ai
FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
QAnything
Question and Answer based on Anything.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Qwen-Agent
Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MathGLM
Official PyTorch implementation of MathGLM
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
helloworld
test
fastllm
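A pure C++ cross-platform LLM acceleration library, callable from Python. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile devices.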