Find a home for HiPPO
Closed this issue · 5 comments
RTFM, but the gist is likely to be:
- Pick an instance size, G4DNXL (smallest GPU)
- Create a job that listens to the Slack WebSocket
- Script up the dependencies for the default ML runtime (pip installs, etc.)
- Figure out where to put the model for invocation (ideally MLflow)
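The dependency step above could be sketched as a cluster init script; the exact package set here is an assumption, not a confirmed list:

```shell
#!/bin/bash
# Hypothetical Databricks cluster init script (package list is an assumption):
# layer the bot's extra dependencies on top of the default ML runtime.
set -euo pipefail

# Slack Socket Mode client for the listener job
pip install --upgrade slack-sdk
# Model loading for Llama 2 (registered/invoked via MLflow)
pip install --upgrade transformers accelerate
```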
Unfortunately, the newest Databricks runtime, 13.3 LTS ML, ships CUDA 11.4 (per nvidia-smi), but the earliest version allowed by MLC is 11.6.
Fortunately, Databricks published a blog post on how to run Llama 2: https://www.databricks.com/blog/building-your-generative-ai-apps-metas-llama-2-and-databricks
It hasn't worked perfectly out of the box, but I'll keep cracking at it.
```
OSError: meta-llama/Llama-2-7b-chat-hf is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token`
or log in with `huggingface-cli login` and pass `use_auth_token=True`.
```
- Create a Hugging Face 🤗 account: https://huggingface.co/join
- Request access to Llama 2: https://ai.meta.com/resources/models-and-libraries/llama-downloads/
- Request access to the gated model on Hugging Face: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
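Once access is granted, the token still has to reach the job. A minimal stdlib-only sketch (the guard function name is mine; `HUGGING_FACE_HUB_TOKEN` is the environment variable the Hugging Face tooling reads) that fails fast with a clear message instead of the opaque `OSError` above:

```python
import os

def resolve_hf_token() -> str:
    """Return the Hugging Face access token from the environment.

    Raises a readable error up front instead of letting the model load
    fail later with the OSError shown in this issue.
    """
    token = os.environ.get("HUGGING_FACE_HUB_TOKEN", "")
    if not token:
        raise RuntimeError(
            "No Hugging Face token found: set HUGGING_FACE_HUB_TOKEN "
            "or run `huggingface-cli login` before loading "
            "meta-llama/Llama-2-7b-chat-hf"
        )
    return token
```

The returned token can then be passed to the model-loading call via its auth parameter (`use_auth_token` in the traceback above).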
It has been decided: we will run HiPPO in Databricks with MLflow.