/SliM-LLM

SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models

Primary Language: Python

Issues