LiuXiaoxuanPKU's Stars
ggerganov/llama.cpp
LLM inference in C/C++
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
karpathy/llm.c
LLM training in simple, raw C/CUDA
plasma-umass/scalene
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
srush/GPU-Puzzles
Solve puzzles. Learn CUDA.
SkalskiP/courses
A curated collection of links to courses and resources about Artificial Intelligence (AI)
DefTruth/Awesome-LLM-Inference
📖 A curated list of awesome LLM/VLM inference papers with code: WINT8/4, Flash-Attention, Paged-Attention, parallelism, etc. 🎉
Zjh-819/LLMDataHub
A quick guide to trending instruction fine-tuning datasets
huggingface/blog
Public repo for HF blog posts
FranxYao/chain-of-thought-hub
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting
FasterDecoding/Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
microsoft/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
huachaohuang/awesome-dbdev
Awesome materials about database development.
the-full-stack/website
Source for https://fullstackdeeplearning.com
kakaobrain/torchgpipe
A GPipe implementation in PyTorch
THUDM/LongBench
LongBench v2 and LongBench (ACL 2024)
mirage-project/mirage
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
hemingkx/SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
apoorvumang/prompt-lookup-decoding
rmihaylov/falcontune
Fine-tune any FALCON model in 4-bit
bojone/NBCE
Naive Bayes-based Context Extension
lucidrains/speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
amesar/mlflow-examples
Basic and advanced MLflow examples for many ML flavors
HPMLL/BurstGPT
A ChatGPT (GPT-3.5) and GPT-4 workload trace for optimizing LLM serving systems
r2e-project/r2e
r2e: turn any GitHub repository into a programming agent environment
EmbeddedLLM/vllm-rocm
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
tyler-griggs/melange-release
vllm-project/dashboard
vLLM performance dashboard
flashinfer-ai/debug-print
Debug-print operator for CUDA graph debugging