# llama2-webui

Run Llama 2 locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supports Llama-2-7B/13B/70B with 8-bit and 4-bit quantization. Supports GPU inference (from 6 GB VRAM) and CPU inference.
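
For orientation, below is a minimal sketch of the general approach (not this repository's actual code), assuming `transformers`, `bitsandbytes`, `accelerate`, and `gradio` are installed and you have accepted the Llama 2 license on Hugging Face. The model ID, generation parameters, and prompt format here are illustrative assumptions.

```python
# Minimal sketch: load Llama-2-7B-chat in 4-bit via transformers + bitsandbytes
# and serve it through a simple Gradio chat UI. Not the repo's actual code.
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "meta-llama/Llama-2-7b-chat-hf"  # assumes HF access to the gated repo

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # fits ~6 GB VRAM
    device_map="auto",  # places layers on GPU, spilling to CPU if needed
)

def generate(message, history):
    # Wrap the user message in Llama 2's [INST] chat format.
    prompt = f"<s>[INST] {message} [/INST]"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, max_new_tokens=256, do_sample=True, temperature=0.7
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

gr.ChatInterface(generate).launch()
```

Dropping the `quantization_config` argument loads the model in full precision, and 8-bit loading works the same way with `BitsAndBytesConfig(load_in_8bit=True)`; on a CPU-only machine you would omit quantization and rely on `device_map="auto"` or a CPU-oriented backend.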
