lokinko's Stars
ggerganov/llama.cpp
LLM inference in C/C++
comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
openai/CLIP
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
HqWu-HITCS/Awesome-Chinese-LLM
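A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.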
systemdesign42/system-design
A resource to help you pass system design interviews and become good at work 👇
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
liguodongiot/llm-action
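This project aims to share the technical principles behind large models and hands-on experience with them (LLM engineering and deploying LLM applications in practice)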
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
aishwaryanr/awesome-generative-ai-guide
A one stop repository for generative AI research updates, interview resources, notebooks and much more!
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
google/gemma.cpp
Lightweight, standalone C++ inference engine for Google's Gemma models.
deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DefTruth/Awesome-LLM-Inference
📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Eladlev/AutoPrompt
A framework for prompt tuning using Intent-based Prompt Calibration
BBuf/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
datawhalechina/tiny-universe
"A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe
hyp1231/awesome-llm-powered-agent
Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
microsoft/ToRA
ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].
NVlabs/DoRA
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
liguodongiot/llm-resource
A collection of high-quality full-stack LLM resources
LLMServe/DistServe
Disaggregated serving system for Large Language Models (LLMs).
apple/pfl-research
Simulation framework for accelerating research in Private Federated Learning
wuhobin/blog-home
A clean, simple personal portfolio homepage
galeselee/Awesome_LLM_System-PaperList
Since the emergence of ChatGPT in 2022, accelerating large language models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on inference acceleration; related works will be added gradually. Contributions are welcome!
TemporaryLoRA/Temp-LoRA
weishengying/tiny-flash-attention
A simplified flash-attention implementation built with cutlass, intended for teaching