/Atom

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Primary Language: Cuda

Issues