Abatom
Over 10 years of experience in server architecture design and optimization; proficient in networking, caching, and memory.
Xiaomi · Beijing
Pinned Repositories
AimRT
A high-performance runtime framework for modern robotics.
flashinfer
FlashInfer: Kernel Library for LLM Serving
llama.cpp
LLM inference in C/C++
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
manim
Animation engine for explanatory math videos
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
sglang
SGLang is a fast serving framework for large language models and vision language models.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Abatom's Repositories
Abatom/AimRT
A high-performance runtime framework for modern robotics.
Abatom/flashinfer
FlashInfer: Kernel Library for LLM Serving
Abatom/llama.cpp
LLM inference in C/C++
Abatom/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Abatom/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Abatom/manim
Animation engine for explanatory math videos
Abatom/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Abatom/sglang
SGLang is a fast serving framework for large language models and vision language models.
Abatom/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs