Pinned Repositories
AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
automix
Mixing Language Models with Self-Verification and Meta-Verification
BeetleDB
A simple database implementation that supports SQL statements
CourseNotes
Course notes for the Department of Computer Science and Technology, Tsinghua University
huangyuxiang03.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
iotdb-from-Apache-
Apache IoTDB
Locret
LookaheadDecoding
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
huangyuxiang03's Repositories
huangyuxiang03/Locret
huangyuxiang03/CourseNotes
Course notes for the Department of Computer Science and Technology, Tsinghua University
huangyuxiang03/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
huangyuxiang03/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
huangyuxiang03/automix
Mixing Language Models with Self-Verification and Meta-Verification
huangyuxiang03/BeetleDB
A simple database implementation that supports SQL statements
huangyuxiang03/huangyuxiang03.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
huangyuxiang03/iotdb-from-Apache-
Apache IoTDB
huangyuxiang03/knnlm
huangyuxiang03/LookaheadDecoding
huangyuxiang03/lzbench
lzbench is an in-memory benchmark of open-source LZ77/LZSS/LZMA compressors
huangyuxiang03/REKCARC-TSC-UHT
Guidance for courses in the Department of Computer Science and Technology, Tsinghua University
huangyuxiang03/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
huangyuxiang03/Mooncake
huangyuxiang03/Ouroboros
huangyuxiang03/ring-flash-attention
huangyuxiang03/Star-Attention
Efficient LLM Inference over Long Sequences