Pinned Repositories
InternLM
Official release of InternLM2.5 base and chat models. 1M context support
ktransformers
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
exllamav2-KTransformers
A fast inference library for running LLMs locally on modern consumer-class GPUs, supporting DeepSeek and Qwen2 MoE
exllamav2.0.0.20
exllamav2 benchmark
gpt-fast-retrival
KV cache retrieval test
numpy-ml
Machine learning, in numpy
pan-light
Baidu Netdisk client without speed limits, golang + qt5, cross-platform GUI
triton
Development repository for the Triton language and compiler
exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
CacheBlend
qiyuxinlin's Repositories
qiyuxinlin/exllamav2-KTransformers
A fast inference library for running LLMs locally on modern consumer-class GPUs, supporting DeepSeek and Qwen2 MoE
qiyuxinlin/exllamav2.0.0.20
exllamav2 benchmark
qiyuxinlin/gpt-fast-retrival
KV cache retrieval test
qiyuxinlin/numpy-ml
Machine learning, in numpy
qiyuxinlin/pan-light
Baidu Netdisk client without speed limits, golang + qt5, cross-platform GUI