LRL-ModelCloud

Pinned Repositories

AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Language: Python 00
GPTQModel
Language: Python 00
sglang
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
Language: Python 00
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 00
GPTQModel
Production-ready LLM compression/quantization toolkit with accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
Language: Python 208 3 8931
AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Language: Python 2 0 00

LRL-ModelCloud's Repositories

LRL-ModelCloud/AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Language: Python 00
LRL-ModelCloud/GPTQModel
Language: Python 00
LRL-ModelCloud/sglang
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
Language: Python 00
LRL-ModelCloud/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 00