RuntimeError: CUDA error: device-side assert triggered when using Llama 2 from HF
andreasbinder opened this issue · 3 comments
Good Day!
I tried to run the GSM8k example with the model from HF as you described (only adjusting the log and prompt paths):
CUDA_VISIBLE_DEVICES=0,1 python examples/rap_gsm8k/inference.py --base_lm hf --hf_path meta-llama/Llama-2-70b-hf --hf_peft_path None --hf_quantized 'nf4'
However, I received the following error:
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
I think this is related to the warning also mentioned in the log trace:
llm-reasoners/reasoners/lm/hf_model.py:137: UserWarning: the eos_token '\n' is encoded into [29871, 13] with length != 1, using 13 as the eos_token_id
warnings.warn(f'the eos_token {repr(token)} is encoded into {tokenized} with length != 1, '
From searching on GitHub, I believe it is related to an input mismatch caused by incorrect tokenisation (see related issues 1, 2, 3).
Did you also encounter this problem, and if so, how did you work around it?
I will try the other versions of Llama in the meantime.
I am using transformers 4.33.1
Thx!
Hi, for the CUDA error, could you try following the message and re-running with `CUDA_LAUNCH_BLOCKING=1`? That makes CUDA calls synchronous, so the stack trace points at the kernel that actually failed.
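For example, the same command you posted above, just with the variable set (everything else unchanged):

```
CUDA_LAUNCH_BLOCKING=1 CUDA_VISIBLE_DEVICES=0,1 python examples/rap_gsm8k/inference.py --base_lm hf --hf_path meta-llama/Llama-2-70b-hf --hf_peft_path None --hf_quantized 'nf4'
```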
The warning you showed shouldn't matter; it's expected in this example. We want the generation to stop at `\n`, and 13 is the token id of `\n`. For some reason, it's encoded into two tokens (`[29871, 13]`), so we just use 13 as the eos_token_id.
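To illustrate, here is a minimal sketch of the tokenizer behavior behind that warning (not the exact code in `hf_model.py`; the model id is just the one from this thread, and any Llama-2 tokenizer shows the same pattern, assuming you have access to the gated repo):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-hf")

# Llama's SentencePiece tokenizer prepends the meta-space token 29871
# when encoding "\n", so the result has length 2 instead of 1.
ids = tok.encode("\n", add_special_tokens=False)
print(ids)  # [29871, 13]

# The warning fires because len(ids) != 1; the last id (13) is the actual
# newline token, so it is the one used as the stop token for generation.
eos_token_id = ids[-1]
```

So the warning is benign here: generation still stops at the newline as intended.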
Please send us more detailed information about the error, since the RuntimeError and the warning alone don't give us enough to go on. We'd be delighted to help you with our work :p
Hi! I am sorry for the late reply :(
I have worked with TheBloke/Llama-2-13B-GPTQ for most of my experiments so far. I now tried Llama 2 again, and I did not run into the problem this time ^^
In case I encounter the error again and find the corresponding solution, I will let you know!