yangxianpku's Stars
Eugeny/tabby
A terminal for a more modern age
pytorch/examples
A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
KwaiVGI/LivePortrait
Bring portraits to life!
datawhalechina/llm-cookbook
An introductory LLM tutorial for developers; the Chinese edition of Andrew Ng's large language model course series
BlinkDL/ChatRWKV
ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
CnTransGroup/EffectiveModernCppChinese
Chinese translation of "Effective Modern C++" (translation complete)
idootop/mi-gpt
🏠 Connect the XiaoAi speaker to ChatGPT and Doubao and turn it into your own personal voice assistant.
LargeWorldModel/LWM
mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
wangrongding/wechat-bot
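🤖 A WeChat bot built on WeChaty combined with AI services such as OpenAI ChatGPT, Kimi, and iFlytek. It can automatically reply to WeChat messages, manage WeChat groups and friends, detect zombie followers, etc...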
ridgerchu/matmulfreellm
Implementation for MatMul-free LM.
Turing-Project/AntiFraudChatBot
A simple prompt-chatting AI based on WeChaty and a fine-tuned NLP model
flame/how-to-optimize-gemm
flexflow/FlexFlow
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
ucbrise/clipper
A low-latency prediction-serving system
wilicc/gpu-burn
Multi-GPU CUDA stress test
NVIDIA/cccl
CUDA Core Compute Libraries
rapidsai/raft
RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
ymcui/Chinese-Mixtral
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
ray-project/llmperf
LLMPerf is a library for validating and benchmarking LLMs
pcg-mlp/KsanaLLM
bytedance/ByteMLPerf
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
microsoft/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
modelscope/evalscope
A streamlined and customizable framework for efficient large model evaluation and performance benchmarking
owensgroup/merge-spmm
Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018
jp-um/university_of_malta_LaTeX_dissertation_template
A modern, highly configurable assignment/project/fyp/dissertation/thesis template.
eniac/paella
Paella: Low-latency Model Serving with Virtualized GPU Scheduling
manoelcampos/template-ubi-latex
UNOFFICIAL version of the LaTeX template for writing theses and dissertations at the Universidade da Beira Interior (UBI), Portugal 🎓📘