SqueezeAILab/KVQuant

self.include_sparse being 0 causes an assert (False) error

ascendpoet opened this issue · 1 comment

Excuse me: when running cache-llama-activations.py in the deployment directory to generate activations.pickle, an assert (False) error is raised in the parallel_pack function of the QuantK class in deployment/transformers/src/transformers/models/llama/modeling_llama.py, because self.include_sparse is set to 0 (see the attached screenshot). It seems there is an issue with the workflow.

The quantizers.pickle file was generated successfully. Should the instructions in the README be adjusted so that activations.pickle can also be generated?
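For reference, here is a minimal sketch of the failure mode described above. The class and method names mirror the report (QuantK, parallel_pack), but the body is a hypothetical simplification, not the actual KVQuant implementation: the dense-only branch is guarded by assert (False), so any configuration that leaves include_sparse at 0 trips the assertion.

```python
class QuantK:
    """Simplified stand-in for the QuantK class described in the issue."""

    def __init__(self, include_sparse: int = 0):
        # In the reported run, this ends up as 0, triggering the assert below.
        self.include_sparse = include_sparse

    def parallel_pack(self, tensor):
        if self.include_sparse:
            # Sparse-outlier packing path: the one the code expects to take.
            return ("packed_with_sparse", tensor)
        # Dense-only path is treated as unreachable, hence the hard failure.
        assert False, "parallel_pack expects include_sparse to be enabled"


# With the sparse path enabled, packing succeeds:
q = QuantK(include_sparse=1)
print(q.parallel_pack([1, 2, 3])[0])

# With include_sparse == 0 (the reported configuration), the assert fires:
try:
    QuantK(include_sparse=0).parallel_pack([1, 2, 3])
except AssertionError as e:
    print(f"AssertionError: {e}")
```

If this matches the real control flow, the fix would be either to enable the sparse path when constructing the quantizer or to document in the README which options must be set before running cache-llama-activations.py.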
bug

Excuse me, I ran into the same problem as you. Have you solved it yet? Thanks a lot!