Pinned Repositories
MNN
MNN is a blazing-fast, lightweight deep learning framework, battle-tested in business-critical use cases at Alibaba.
AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
llama.cpp
LLM inference in C/C++
TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
bloomfilters
Bloom filter for a recommendation system
libevent
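"Libevent深入浅出" (Libevent in Depth): this book assumes some background in concurrent server programming and familiarity with I/O multiplexing mechanisms such as select and epoll.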
Ross-Fan's Repositories
Ross-Fan/bloomfilters
Bloom filter for a recommendation system
Ross-Fan/libevent
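"Libevent深入浅出" (Libevent in Depth): this book assumes some background in concurrent server programming and familiarity with I/O multiplexing mechanisms such as select and epoll.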