huggingface/candle

Cannot run llama example : access to source requires login credentials

dbrowne opened this issue · 4 comments

cargo run --example llama --release
warning: some crates are on edition 2021 which defaults to resolver = "2", but virtual workspaces default to resolver = "1"
note: to keep the current resolver, specify workspace.resolver = "1" in the workspace root's manifest
note: to use the edition 2021 resolver, specify workspace.resolver = "2" in the workspace root's manifest
Finished release [optimized] target(s) in 0.17s
Running target/release/examples/llama
Running on CPU, to run on GPU, build this example with --features cuda
loading the model weights from meta-llama/Llama-2-7b-hf
Error: request error: https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/tokenizer.json: status code 401

Caused by:
https://huggingface.co/meta-llama/Llama-2-7b-hf/resolve/main/tokenizer.json: status code 401

This is likely caused by the model being "gated": you have to accept some conditions before you can access it. You should be able to do so by registering on the Hugging Face Hub and then visiting https://huggingface.co/meta-llama/Llama-2-7b-hf to accept the terms. After that, you'll have to set up an authentication token locally so that the permission check can succeed:

pip install huggingface_hub
huggingface-cli login

The commands above are for those who already have the Python ecosystem available. Otherwise, create a file at $HOME/.cache/huggingface/token containing your HF token.
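The manual route can be sketched as below. Note that `hf_XXXXXXXX` is a placeholder, not a real credential: substitute the token generated under your Hub account settings after accepting the model's terms.

```shell
# Store the Hub token manually (no Python needed).
# The hf-hub crate used by the candle examples reads this file.
mkdir -p "$HOME/.cache/huggingface"
printf '%s' 'hf_XXXXXXXX' > "$HOME/.cache/huggingface/token"
# Keep the credential readable only by you.
chmod 600 "$HOME/.cache/huggingface/token"
```

The token should be written without a trailing newline, which is what `printf '%s'` does here.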

Closing this, as the instructions above should hopefully cover it.

It might be better to document this in the README files.