Pinned Repositories
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
ao
PyTorch native quantization and sparsity for training and inference
gemlite
Fast low-bit matmul kernels in Triton
hqq
Official implementation of Half-Quadratic Quantization (HQQ)
sglang
SGLang is a fast serving framework for large language models and vision language models.
low-rank-llama2
Low-Rank Llama Custom Training
pytorch
Tensors and dynamic neural networks in Python with strong GPU acceleration
triton
Development repository for the Triton language and compiler
mobicham's Repositories
mobicham/ao
PyTorch native quantization and sparsity for training and inference
mobicham/gemlite
Fast low-bit matmul kernels in Triton
mobicham/hqq
Official implementation of Half-Quadratic Quantization (HQQ)
mobicham/sglang
SGLang is a fast serving framework for large language models and vision language models.