Pinned Repositories
tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
dgl
Python package built to ease deep learning on graphs, on top of existing DL frameworks.
flashinfer
FlashInfer: Kernel Library for LLM Serving
mlc-llm
Universal LLM Deployment Engine with ML Compilation
punica
Serving multiple LoRA-finetuned LLMs as one
SparseTIR
SparseTIR: Sparse Tensor Compiler for Deep Learning
BPT
Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning"
language-grounding-experiments
Experiments on language grounding.
segtree-transformer-v0
Code for SegTree Transformer (ICLR-RLGM 2019).
yzh119's Repositories
yzh119/bibfetch
Fetch bibtex entries from academic search engines like dblp.
yzh119/mirage
A multi-level tensor algebra superoptimizer
yzh119/punica
Serving multiple LoRA-finetuned LLMs as one
yzh119/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
yzh119/relax
Temp repo for prototyping relax (relay next); the effort will be upstreamed. We use the wiki pages on this repo to host design docs.
yzh119/cutlass
CUDA Templates for Linear Algebra Subroutines
yzh119/dgsparse
yzh119/flashinfer-ai.github.io
Project website of FlashInfer project
yzh119/flashinfer-dev
FlashInfer: Kernel Library for LLM Serving
yzh119/kernels
yzh119/llm-perf-bench
yzh119/metal-benchmarks
Apple GPU microarchitecture
yzh119/mlx
MLX: An array framework for Apple silicon
yzh119/nccl
Optimized primitives for collective multi-GPU communication
yzh119/NetHack
Official NetHack Git Repository
yzh119/open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
yzh119/pbrt-v4
Source code to pbrt, the ray tracer described in the forthcoming 4th edition of the "Physically Based Rendering: From Theory to Implementation" book.
yzh119/relax-sparse
Temp repo for prototyping relax (relay next); the effort will be upstreamed. We use the wiki pages on this repo to host design docs.
yzh119/sglang
SGLang is a fast serving framework for large language models and vision language models.
yzh119/smoothquant
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
yzh119/texmacs
Source code of GNU TeXmacs
yzh119/tlcpack
yzh119/triton
Development repository for the Triton language and compiler
yzh119/tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
yzh119/tvm-rfcs
A home for the final text of all TVM RFCs.
yzh119/utils
yzh119/uwsampl.github.io
The UW SAMPL group's website.
yzh119/web-data
yzh119/web-llm
Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.
yzh119/web-stable-diffusion
Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.