Conversational AI sandbox
Vicuna-13B is a fine-tuned version of LLaMA [5].
- LLaMA [1]: family of LLMs (7B to 65B parameters) by Meta
- Alpaca [2]: fine-tuned version of LLaMA
- GPTQ repo [3]
- GPTQ-for-LLaMa: LLaMA-specific implementation of GPTQ [3]
- LLaMA 7B
- LLaMA 13B, 4-bit quantized
- Alpaca 7B, natively fine-tuned, i.e. no LoRA [4].
- Alpaca 7B 4-bit: 4-bit quantized weights
- Alpaca 30B 4-bit: 4-bit quantized, trained using LoRA [4]
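To make the "4-bit" labels above concrete, the sketch below shows naive round-to-nearest 4-bit quantization of a single weight row with one per-row scale. GPTQ [3] chooses the rounded values far more carefully (minimizing layer output error), so this is an illustration of the storage idea only, not of the GPTQ algorithm; all names here are hypothetical.

```python
# Illustration only: what storing weights in 4 bits means at the lowest level.
# Naive round-to-nearest with a single per-row scale; GPTQ [3] is smarter
# about which integer each weight maps to, but uses the same kind of format
# (low-bit integers plus floating-point scales).

def quantize_4bit(weights):
    """Map floats to signed 4-bit integers in [-8, 7] with one per-row scale."""
    scale = max(abs(w) for w in weights) / 7.0  # 7 = largest positive 4-bit value
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    """Reconstruct approximate float weights from 4-bit integers."""
    return [v * scale for v in q]

row = [0.12, -0.53, 0.07, 0.91, -0.33]
q, scale = quantize_4bit(row)
approx = dequantize_4bit(q, scale)
# Each reconstructed weight lies within half a quantization step of the original.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(row, approx))
```

Storing 4-bit integers plus a scale instead of 16-bit floats is what lets the 13B and 30B checkpoints above fit on a single consumer GPU.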
The following assumes the dependencies in requirements.txt have been installed.
- Clone GPTQ-for-LLaMa into the third_party folder:
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git
- In the GPTQ-for-LLaMa folder, build and install the CUDA kernel (assuming the project environment is active):
CUDA_PATH=/usr/local/cuda-11.7 python setup_cuda.py install
- Test the installation:
CUDA_VISIBLE_DEVICES=0 python test_kernel.py
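Beyond running test_kernel.py, a quick sanity check from the project environment is to verify that the compiled extension is importable. This sketch assumes the extension module built by setup_cuda.py is named quant_cuda, as in the upstream repo; verify the name against the version you cloned.

```python
# Minimal post-install check (a sketch): GPTQ-for-LLaMa's setup_cuda.py builds
# a CUDA extension module; the name "quant_cuda" is an assumption based on the
# upstream repo and should be checked against your clone.
import importlib.util

def gptq_kernel_available() -> bool:
    """True if the compiled quant_cuda extension can be found on sys.path."""
    return importlib.util.find_spec("quant_cuda") is not None

print("quant_cuda built:", gptq_kernel_available())
```

If this prints False after a successful build, the project environment that ran setup_cuda.py is probably not the one currently active.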
[1] Hugo Touvron, et al., LLaMA: Open and Efficient Foundation Language Models, https://arxiv.org/abs/2302.13971
[2] Rohan Taori, et al., Stanford Alpaca: An Instruction-following LLaMA model, https://github.com/tatsu-lab/stanford_alpaca
[3] Elias Frantar, et al., GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers, https://arxiv.org/abs/2210.17323
[4] Edward J. Hu, et al., LoRA: Low-Rank Adaptation of Large Language Models, https://arxiv.org/abs/2106.09685