PuyangChen's Stars
bpc-clone/bypass-paywalls-firefox-clean
HqWu-HITCS/Awesome-Chinese-LLM
A curated collection of open-source Chinese LLMs, focusing on smaller models that can be privately deployed at low training cost, covering base models, domain-specific fine-tunes and applications, datasets, and tutorials.
adithya-s-k/omniparse
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
tablegpt/tablegpt-agent
A pre-built agent for TableGPT2.
kyegomez/swarms
The Enterprise-Grade, Production-Ready Multi-Agent Orchestration Framework. Join the community: https://discord.com/servers/agora-999382051935506503
Open-Source-O1/Open-O1
hey-it-s-me/CoRPLE
An implementation of "Contourlet Residual for Prompt Learning Enhanced Infrared Image Super-Resolution" (CoRPLE)
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
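The transcription API boils down to loading a CTranslate2-converted model and iterating over timestamped segments. A minimal sketch; the model size, device, and audio path are placeholders:

```python
# Minimal faster-whisper sketch; "small", the device, and the audio path are placeholders.
from faster_whisper import WhisperModel

# Load a CTranslate2-converted Whisper model; "small" is fetched from the Hugging Face Hub.
model = WhisperModel("small", device="cpu", compute_type="int8")

# transcribe() returns a lazy generator of segments plus language metadata.
segments, info = model.transcribe("audio.mp3")
print(f"Detected language: {info.language}")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```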
microsoft/LLMLingua
[EMNLP'23, ACL'24] To speed up LLM inference and sharpen the model's perception of key information, LLMLingua compresses the prompt and KV-cache, achieving up to 20x compression with minimal performance loss.
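The compression step is easiest to see in code. A hedged sketch of the PromptCompressor API; the LLMLingua-2 model name follows the project's examples, and exact defaults may differ across versions:

```python
# Sketch of LLMLingua prompt compression; the model name is taken from the
# project's LLMLingua-2 examples and may change across releases.
from llmlingua import PromptCompressor

compressor = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
    use_llmlingua2=True,
)

long_prompt = "..."  # placeholder for a long context to compress

# rate=0.33 asks for roughly a 3x reduction; the result dict also reports
# token counts and the achieved ratio.
result = compressor.compress_prompt(long_prompt, rate=0.33)
print(result["compressed_prompt"])
```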
geekan/MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
microsoft/autogen
A programming framework for agentic AI 🤖. PyPI: autogen-agentchat. Discord: https://aka.ms/autogen-discord. Office hour: https://aka.ms/autogen-officehour
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
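A minimal sketch of the pipeline API from the README; the model ID is just an example, and a CUDA-capable GPU is assumed:

```python
# Minimal LMDeploy pipeline sketch; the model ID is an example and a CUDA GPU is assumed.
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2-chat-7b")

# The pipeline accepts a batch of prompts and returns one response per prompt.
responses = pipe(["Hi, please introduce yourself.", "Shanghai is"])
print(responses)
```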
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
triton-lang/triton
Development repository for the Triton language and compiler
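The canonical first Triton kernel is element-wise vector addition; this condensed version of the project's tutorial shows the block-and-mask programming model:

```python
# Vector-add kernel condensed from the Triton tutorials.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                       # each program handles one block
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                       # guard out-of-bounds lanes
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)                # one program per 1024-element block
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```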
OrionStarAI/Orion
Orion-14B is a family of models built around a 14B-parameter multilingual foundation LLM, with a series of derived models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an agent fine-tuned model.
leptonai/leptonai
A Pythonic framework to simplify AI service building
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by the Qwen team at Alibaba Cloud.
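The models load through standard Hugging Face transformers; a small sketch using the 0.5B instruct variant only to keep the download light:

```python
# Qwen2.5 chat via Hugging Face transformers; the 0.5B variant keeps the example small.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Give me a short introduction to LLMs."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens and decode only the newly generated continuation.
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```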
OpenBMB/MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance that can be trained directly like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.
leptonai/search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
hpcaitech/SwiftInfer
Efficient AI Inference & Serving
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
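The core entry point computes exact attention without materializing the full attention matrix; a sketch assuming fp16 tensors on a CUDA device, in the (batch, seqlen, nheads, headdim) layout the library expects:

```python
# flash_attn_func expects fp16/bf16 CUDA tensors shaped (batch, seqlen, nheads, headdim).
import torch
from flash_attn import flash_attn_func

q = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")

out = flash_attn_func(q, k, v, causal=True)  # causal mask for decoder-style models
print(out.shape)                             # (2, 1024, 8, 64)
```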
lyogavin/airllm
AirLLM 70B inference on a single 4GB GPU
mistralai/mistral-inference
Official inference library for Mistral models
SkyworkAI/Skywork
Skywork series models are pre-trained on 3.2 TB of high-quality multilingual (mainly Chinese and English) and code data. The model weights, training data, evaluation data, and evaluation methods have all been open-sourced.
eric-ai-lab/MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
hy-zhao23/Explainability-for-Large-Language-Models
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
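The attention-sink idea is simple to sketch: always keep the KV entries of the first few tokens plus a sliding window of recent ones, evicting everything in between. A toy illustration of that cache policy, not the repo's code:

```python
# Toy illustration of the attention-sink KV-cache policy from the paper:
# retain n_sink initial tokens plus a window of recent tokens, evict the middle.
def evict_kv_cache(cache: list, n_sink: int = 4, window: int = 1020) -> list:
    """cache is a list of per-token KV entries, oldest first."""
    if len(cache) <= n_sink + window:
        return cache
    return cache[:n_sink] + cache[-window:]

cache = list(range(2000))                 # stand-in for 2000 cached KV entries
cache = evict_kv_cache(cache)
print(len(cache), cache[:4], cache[-2:])  # 1024 [0, 1, 2, 3] [1998, 1999]
```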
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology