LLM text generation notebook for Google Colab
This notebook uses https://github.com/oobabooga/text-generation-webui to run conversational models in chat mode.
▶⏩ Run all the cells and a public Gradio URL will appear at the bottom in around 5 minutes. 🤞🐱‍👤
Updates:
* Updated GPTQ to the latest version for Vicuna 13B 1.1 support
* Added "print installed models" to the installer cell (useful for a Google Drive install)
* Set gdrive and "save all" to true (adds memory and a faster start)
* Added more interesting models to choose from:
- reeducator/vicuna-13b-free: https://huggingface.co/reeducator/vicuna-13b-free
- TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g: https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g
- TheBloke/wizardLM-7B-GPTQ (0.46 tokens/s; needs the --model_type llama argument to launch, and manually remove the no-act-order model file): https://huggingface.co/TheBloke/wizardLM-7B-GPTQ
- Aitrepreneur/wizardLM-7B-GPTQ-4bit-128g (0.46 tokens/s; needs the --model_type llama argument to launch): https://huggingface.co/Aitrepreneur/wizardLM-7B-GPTQ-4bit-128g
- gozfarb/oasst-llama13b-4bit-128g (1.62 tokens/s): https://huggingface.co/gozfarb/oasst-llama13b-4bit-128g
- catalpa/codecapybara-4bit-128g-gptq: https://huggingface.co/catalpa/codecapybara-4bit-128g-gptq
- mzedp/dolly-v2-12b-GPTQ-4bit-128g: https://huggingface.co/mzedp/dolly-v2-12b-GPTQ-4bit-128g
- autobots/pythia-12b-gptqv2-4bit: https://huggingface.co/autobots/pythia-12b-gptqv2-4bit
- TheBloke/medalpaca-13B-GPTQ-4bit: https://huggingface.co/TheBloke/medalpaca-13B-GPTQ-4bit
- TheBloke/gpt4-alpaca-lora-13B-GPTQ-4bit-128g: https://huggingface.co/TheBloke/gpt4-alpaca-lora-13B-GPTQ-4bit-128g
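For the models above that need the llama model type, the launch cell boils down to a command roughly like the following. This is a sketch, not the notebook's exact cell: the flag names are the text-generation-webui command-line options of this era, and the model folder name is one example from the list.

```shell
# Example launch of a 4-bit GPTQ model in chat mode (a sketch; assumes the
# model was downloaded into text-generation-webui's models/ folder).
python server.py \
  --model TheBloke_wizardLM-7B-GPTQ \
  --wbits 4 --groupsize 128 \
  --model_type llama \
  --chat --share   # --share is what produces the public Gradio URL
```

Models whose card does not require it can drop `--model_type llama`; the 4-bit flags (`--wbits 4 --groupsize 128`) match the "4bit-128g" naming of the quantized checkpoints.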