(Experimental) Using Llama2 with Haystack
This notebook contains my hacky experiments with loading and using Llama2 in Haystack, the NLP/LLM framework.
It's nothing official or polished, but it may be useful to other people experimenting.
- Installed Transformers from the main branch (and other libraries) 📚 (install commands sketched after this list)
- Loaded Llama-2-13b-chat-hf on Colab using 4-bit quantization, thanks to the great material shared by Younes Belkada 🙌 (loading sketch after this list)
- Disabled Tensor Parallelism, which was causing some issues 🛠️
- Installed a minimal version of Haystack
- Found a hacky way to load the model in Haystack's PromptNode (sketched after this list)
- Had a llama-zing chat session, from 🎧🎶 David Guetta to Don Matteo ⛪📿 (an Italian TV series)!
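
For reference, the installation step looked roughly like this (the exact versions pinned in the notebook may differ; `farm-haystack` is the PyPI package for Haystack):

```bash
# Transformers from the main branch (needed for Llama 2 support at the time)
pip install git+https://github.com/huggingface/transformers.git
# Helpers for 4-bit loading
pip install accelerate bitsandbytes
# Minimal Haystack install (no optional extras)
pip install farm-haystack
```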
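
A minimal sketch of the loading step, assuming the usual bitsandbytes 4-bit (NF4) setup and assuming the Tensor Parallelism workaround is the `pretraining_tp` flag in the Llama config:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-13b-chat-hf"

# 4-bit (NF4) quantization config, following the bitsandbytes integration examples
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # float16 keeps it friendly to Colab GPUs
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Disable the experimental tensor-parallel linear computation,
# which was causing issues in these experiments
model.config.pretraining_tp = 1
```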
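
One way to feed the already-loaded model into Haystack's PromptNode is to pass the model and tokenizer objects through `model_kwargs`, so Haystack reuses them instead of loading the checkpoint again. This is a sketch of that idea; the exact keys are an assumption, not an official API:

```python
from haystack.nodes import PromptModel, PromptNode

# Hacky part: hand the quantized model and tokenizer to the local invocation layer
prompt_model = PromptModel(
    model_name_or_path="meta-llama/Llama-2-13b-chat-hf",
    model_kwargs={"model": model, "tokenizer": tokenizer},
)

prompt_node = PromptNode(prompt_model, max_length=512)

# Quick smoke test
print(prompt_node("Briefly explain what Haystack is."))
```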