Pinned Repositories
community
Stores documents used by the TensorFlow developer community
DeepRec
DeepRec is a recommendation engine based on TensorFlow.
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
langchain
⚡ Building applications with LLMs through composability ⚡
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V-level capabilities and beyond.
llvm
Intel staging area for llvm.org contributions. Home for Intel LLVM-based projects.
Megatron-LM
Ongoing research training transformer models at scale
ScaleLLM
A high-performance inference system for large language models, designed for production environments.
tensorflow-self
Open source software library for numerical computation using data flow graphs.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
liutongxuan's Repositories
liutongxuan/tensorrt_llm_test
liutongxuan/FasterTransformer
Transformer-related optimizations, including BERT and GPT
liutongxuan/flash-attention
Fast and memory-efficient exact attention
liutongxuan/langchain
⚡ Building applications with LLMs through composability ⚡
liutongxuan/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V-level capabilities and beyond.
liutongxuan/ScaleLLM
A high-performance inference system for large language models, designed for production environments.
liutongxuan/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
liutongxuan/AutoGPT
An experimental open-source attempt to make GPT-4 fully autonomous.
liutongxuan/cuda_hgemm
Several optimization methods for half-precision general matrix multiplication (HGEMM) using tensor cores via the WMMA API and MMA PTX instructions.
liutongxuan/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
liutongxuan/EMMA
liutongxuan/flashinfer
FlashInfer: Kernel Library for LLM Serving
liutongxuan/kserve
Standardized Serverless ML Inference Platform on Kubernetes
liutongxuan/lectures
Material for cuda-mode lectures
liutongxuan/llama2.c
Inference Llama 2 in one file of pure C
liutongxuan/LLMBench
A library for validating and benchmarking LLM inference.
liutongxuan/lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
liutongxuan/marlin
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
liutongxuan/MetaGPT
🌟 The Multi-Agent Framework: given a one-line requirement, returns a PRD, design, tasks, and repo
liutongxuan/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
liutongxuan/OpenAgents
OpenAgents: An Open Platform for Language Agents in the Wild
liutongxuan/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
liutongxuan/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solutions team.
liutongxuan/transfusion-pytorch
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
liutongxuan/triton
Development repository for the Triton language and compiler
liutongxuan/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
liutongxuan/Vitis-AI
Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
liutongxuan/Voyager
An Open-Ended Embodied Agent with Large Language Models
liutongxuan/xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
liutongxuan/yolov8_model