Ross-Fan

Pinned Repositories

MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
Language: C++ 8.6k 201 2.6k 1.7k
AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
Language: Python 1.6k 16 381195
llama.cpp
LLM inference in C/C++
Language: C++ 65k 546 3.7k 9.3k
TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, sparsity, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
Language: Python 435 14 7127
bloomfilters
Bloom filter for a recommendation system
Language: Go 0 1 0 0
libevent
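<Libevent深入浅出> (Libevent Made Accessible): this book assumes some background in concurrent server programming and familiarity with I/O multiplexing mechanisms such as select and epoll.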
0 0 0 0

Ross-Fan's Repositories

Ross-Fan/bloomfilters
Bloom filter for a recommendation system
Language: Go 0 1 0 0
Ross-Fan/libevent
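<Libevent深入浅出> (Libevent Made Accessible): this book assumes some background in concurrent server programming and familiarity with I/O multiplexing mechanisms such as select and epoll.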
0 0 0 0