Pinned Repositories
community
Stores documents used by the TensorFlow developer community
DeepRec
DeepRec is a recommendation engine based on TensorFlow.
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
langchain
⚡ Building applications with LLMs through composability ⚡
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V-level capabilities and beyond.
llvm
Intel staging area for llvm.org contributions. Home for Intel LLVM-based projects.
Megatron-LM
Ongoing research training transformer models at scale
ScaleLLM
A high-performance inference system for large language models, designed for production environments.
tensorflow-self
Open source software library for numerical computation using data flow graphs.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
liutongxuan's Repositories
liutongxuan/tensorrt_llm_test
liutongxuan/FasterTransformer
Transformer-related optimizations, including BERT and GPT
liutongxuan/flash-attention
Fast and memory-efficient exact attention
liutongxuan/langchain
⚡ Building applications with LLMs through composability ⚡
liutongxuan/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V-level capabilities and beyond.
liutongxuan/ScaleLLM
A high-performance inference system for large language models, designed for production environments.
liutongxuan/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
liutongxuan/AutoGPT
An experimental open-source attempt to make GPT-4 fully autonomous.
liutongxuan/cuda_hgemm
Several optimization methods for half-precision general matrix multiplication (HGEMM) using tensor cores via the WMMA API and MMA PTX instructions.
liutongxuan/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
liutongxuan/EMMA
liutongxuan/flashinfer
FlashInfer: Kernel Library for LLM Serving
liutongxuan/kserve
Standardized Serverless ML Inference Platform on Kubernetes
liutongxuan/lectures
Material for cuda-mode lectures
liutongxuan/llama2.c
Inference Llama 2 in one file of pure C
liutongxuan/LLMBench
A library for validating and benchmarking LLM inference.
liutongxuan/lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
liutongxuan/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
liutongxuan/MetaGPT
🌟 The Multi-Agent Framework: given a one-line requirement, returns a PRD, design, tasks, and repo
liutongxuan/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
liutongxuan/OpenAgents
OpenAgents: An Open Platform for Language Agents in the Wild
liutongxuan/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
liutongxuan/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solutions team.
liutongxuan/transfusion-pytorch
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
liutongxuan/triton
Development repository for the Triton language and compiler
liutongxuan/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
liutongxuan/Vitis-AI
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
liutongxuan/Voyager
An Open-Ended Embodied Agent with Large Language Models
liutongxuan/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
liutongxuan/yolov8_model