Abatom
Over 10 years of experience in server architecture design and optimization; proficient in networking, caching, and memory.
Xiaomi · Beijing
Pinned Repositories
AimRT
A high-performance runtime framework for modern robotics.
flashinfer
FlashInfer: Kernel Library for LLM Serving
llama.cpp
LLM inference in C/C++
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
manim
Animation engine for explanatory math videos
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
sglang
SGLang is a fast serving framework for large language models and vision language models.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Abatom's Repositories
Abatom/AimRT
A high-performance runtime framework for modern robotics.
Abatom/flashinfer
FlashInfer: Kernel Library for LLM Serving
Abatom/llama.cpp
LLM inference in C/C++
Abatom/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Abatom/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Abatom/manim
Animation engine for explanatory math videos
Abatom/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Abatom/sglang
SGLang is a fast serving framework for large language models and vision language models.
Abatom/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs