liutongxuan's Stars
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
microsoft/autogen
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
karpathy/llm.c
LLM training in simple, raw C/CUDA
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
ItzCrazyKns/Perplexica
Perplexica is an AI-powered search engine. It is an Open source alternative to Perplexity AI
phidatahq/phidata
Build multi-modal Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI.
openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
triton-lang/triton
Development repository for the Triton language and compiler
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
ccfddl/ccf-deadlines
⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python Cli, Wechat Applet) / If you find it useful, please star this project, thanks~
facebookincubator/AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
xlang-ai/OpenAgents
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
openxla/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
Xilinx/Vitis-AI
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
lucidrains/self-rewarding-lm-pytorch
Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI
punica-ai/punica
Serving multiple LoRA finetuned LLM as one
lucidrains/transfusion-pytorch
Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
huangwl18/VoxPoser
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
vectorch-ai/ScaleLLM
A high-performance inference system for large language models, designed for production environments.
Bruce-Lee-LY/cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
microsoft/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
HazyResearch/lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
SaraGhazanfari/EMMA
vectorch-ai/LLMBench
A library for validating and benchmarking LLMs inference.