Aaronhuang-778/SliM-LLM

SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models

Python

Issues

Calculating saliency of weight
#7 opened a month ago by kiucho
0
Does it support LLAMA3-8B-INSTRUCT and QWEN2-7B-INSTRUCT?
#4 opened a month ago by LiMa-cas
3
Clarification on Theorem 1 of the Paper
#6 opened a month ago by kiucho
4
Cannot reproduce
#5 opened a month ago by haoming-codes
2
the quantized bit width
#3 opened 4 months ago by xiaxin1998
2
How about the performance of SliM-LLM on a 4-bit model?
#2 opened 5 months ago by ZorkJ
6
readme: requirement.txt -> requirements.txt
#1 opened 5 months ago by tellyoung
1