Pinned Repositories
AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
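A minimal sketch of AutoAWQ's documented quantize-and-save flow; the model path and quantization settings below are illustrative assumptions, not project defaults.

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# Illustrative paths and settings -- substitute your own model.
model_path = "mistralai/Mistral-7B-v0.1"
quant_path = "mistral-7b-awq"
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model and its tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize weights to 4-bit AWQ and save the result for inference.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```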
xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
ChatGPT-Next-Web
One-click deployment of a well-designed ChatGPT web UI on Vercel. Get your own ChatGPT web service with a single click.
Easy-Translate
Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible for beginners and as seamless and customizable as possible for advanced users.
exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
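A hedged sketch of loading a quantized model and generating text with exllamav2, following the pattern from the project's examples; the model directory and sampler settings are placeholders, and API details may vary between versions.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Placeholder path to a quantized (e.g. EXL2-format) model directory.
config = ExLlamaV2Config()
config.model_dir = "/models/llama2-7b-exl2"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)  # split weights across available GPU memory

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

# Generate up to 100 new tokens from a prompt.
print(generator.generate_simple("Once upon a time,", settings, 100))
```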
PiPPy
Pipeline Parallelism for PyTorch
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
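A minimal offline-inference sketch using vLLM's documented LLM entry point; the model name and sampling values are placeholder assumptions.

```python
from vllm import LLM, SamplingParams

# Placeholder model; any Hugging Face causal LM supported by vLLM works.
llm = LLM(model="facebook/opt-125m")
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# Batched generation: vLLM schedules all prompts through its paged KV cache.
outputs = llm.generate(["Hello, my name is", "The capital of France is"], sampling_params)
for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```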
laoda513's Repositories
laoda513/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
laoda513/ChatGPT-Next-Web
One-click deployment of a well-designed ChatGPT web UI on Vercel. Get your own ChatGPT web service with a single click.
laoda513/Easy-Translate
Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible for beginners and as seamless and customizable as possible for advanced users.
laoda513/exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs