This is a web server implementation of Llama that lets you run a GGUF model file locally, with a WhatsApp-like user interface. You can use the default GGUF model file downloaded during installation, or download another GGUF model file from HuggingFace.co and place it in the `model` folder.
- Run `npm install`.
- Run `npm run download:q8`, or `npm run download:q3` for systems with limited RAM.
- Run `npm run start`.
- Browse to `http://localhost`.
- Run `npm run start 8080` to serve on a custom port (8080 in this example).
- Browse to `http://localhost:8080`.
- Run `npm install -g forever`.
- Run `npm run forever` to keep the server running in the background.
- Browse to `http://localhost`.
- To stop the server, run `npm run stop`.
- Browse to `http://localhost/?lightmode` for the light theme.
- Browse to `http://localhost/?darkmode` for the dark theme.
The system prompt is defined in the `strings.js` file.
You are legally responsible for any damage caused by your use of this software.