/QuIP-for-Llama

Code for the paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees", adapted for Llama models.

Primary Language: Python
