ypeleg/llama

Add Quantization Code

htcml opened this issue · 0 comments

Are you able to add quantization code so that the model can be run on a smaller GPU?
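
For context, here is a minimal sketch of the kind of weight quantization the request refers to. It assumes symmetric per-tensor int8 quantization on a plain list of floats. The function names `quantize_int8` and `dequantize_int8` are hypothetical and do not come from the ypeleg/llama repository, and a real implementation would operate on the model's weight tensors rather than Python lists.

```python
# Hypothetical illustration only: symmetric per-tensor int8 quantization.
# These helpers are assumptions for this sketch, not part of ypeleg/llama.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid dividing by zero when every weight is zero (or the list is empty).
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float weights from the int8 values and the scale."""
    return [v * scale for v in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

print(q)         # integer codes, each storable in one byte
print(restored)  # approximations of the original weights
```

Each int8 value takes one byte, compared with two bytes for fp16 and four for fp32, which is where the memory saving on a smaller GPU would come from. The cost is a rounding error of at most half the scale per weight.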