Pinned Repositories
MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization, with a 2x speedup during inference.
qserve
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
mlc-llm
Universal LLM Deployment Engine with ML Compilation
filesystem_spec
A specification that python filesystems should adhere to.
xla
A machine learning compiler for GPUs, CPUs, and ML accelerators
mnn-llm
An LLM deployment project based on MNN.
MuYu-zhi's Repositories
MuYu-zhi/filesystem_spec
A specification that python filesystems should adhere to.