seonglae/llama2gptq

Chat to LLaMa 2 that also provides responses with reference documents over vector database. Locally available model using GPTQ 4bit quantization.

PythonMIT

Watchers