Support for Llama 3.1 model
Opened this issue · 3 comments
Are there instructions specific to creating a bmodel from ONNX for Llama 3.1 (not Llama 3)?
Running the following errors out:
python export_onnx.py --model_path ../../../../Meta-Llama-3.1-8B-Instruct/ --seq_length 1024
Convert block & block_cache
0%| | 0/32 [00:00<?, ?it/s]The attention layers in this model are transitioning from computing the RoPE embeddings internally through position_ids (2D tensor with the indexes of the tokens), to using externally computed position_embeddings (Tuple of tensors, containing cos and sin). In v4.45 position_ids will be removed and position_embeddings will be mandatory.
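The warning describes transformers moving RoPE computation out of the attention layers: instead of raw position_ids, callers will pass a precomputed (cos, sin) tuple. A minimal pure-Python sketch of what such externally computed position embeddings look like (standard RoPE frequencies with the usual base of 10000 are assumed; this is illustrative, not the library's actual implementation):

```python
import math

def rope_position_embeddings(position_ids, head_dim, base=10000.0):
    """Compute the (cos, sin) tuple that newer transformers versions expect
    to receive as position_embeddings, instead of raw position_ids.

    Illustrative sketch only: real code would use tensors, not lists.
    """
    # One inverse frequency per pair of head dimensions.
    inv_freq = [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]
    cos = [[math.cos(p * f) for f in inv_freq] for p in position_ids]
    sin = [[math.sin(p * f) for f in inv_freq] for p in position_ids]
    return cos, sin
```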
We are working on supporting Llama 3.1; please be patient, thanks~
LLM-TPU/models/Llama3_1/compile/export_onnx.py does not exist (according to the documentation it should).
Ran pip install --upgrade transformers, which upgraded it to version 4.44.0.
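Since the warning above says the position_ids path is removed in transformers v4.45, an unpinned upgrade can silently cross that boundary. A small sketch of a version guard the export script could run first (the 4.44.0 upper bound is an assumption taken from this thread, not a documented requirement of LLM-TPU):

```python
def version_tuple(v):
    """Parse a 'X.Y.Z' version string into a comparable tuple of ints."""
    return tuple(int(p) for p in v.split(".")[:3])

def transformers_version_ok(installed, max_supported="4.44.0"):
    """Return True if the installed transformers version is at most
    max_supported (assumed cutoff, since v4.45 drops position_ids)."""
    return version_tuple(installed) <= version_tuple(max_supported)
```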
Copying the one from Llama3 and running it gives an error:
The attention layers in this model are transitioning from computing the RoPE embeddings internally through position_ids (2D tensor with the indexes of the tokens), to using externally computed position_embeddings (Tuple of tensors, containing cos and sin). In v4.45 position_ids will be removed and position_embeddings will be mandatory.
AttributeError: 'tuple' object has no attribute 'update'
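This AttributeError typically means the export script is passing past key/values as a plain tuple where a newer transformers code path expects a cache object with an update() method. A minimal illustration of the mismatch (KVCache is a hypothetical stand-in; the real transformers Cache classes have a different API):

```python
class KVCache:
    """Hypothetical cache object; real transformers Cache classes differ.
    This only illustrates why a plain tuple triggers the AttributeError."""

    def __init__(self):
        self.layers = {}

    def update(self, layer_idx, key, value):
        # Store/overwrite the key/value entries for one attention layer.
        self.layers[layer_idx] = (key, value)
        return self.layers[layer_idx]

def attention_step(past_key_values, layer_idx, key, value):
    # Newer code paths call .update() on the cache object. If the export
    # script still passes a legacy tuple here, Python raises:
    #   AttributeError: 'tuple' object has no attribute 'update'
    return past_key_values.update(layer_idx, key, value)
```

Pinning transformers to the version the script was written against, or converting the legacy tuple into the cache object the installed version expects, are the usual ways out of this mismatch.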
In the interim, can you make the bmodel available?
python3 -m dfss --url=open@sophgo.com:/ext_model_information/LLM/LLM-TPU/llama3.1-8b_int4_1dev_seq512.bmodel
Currently that URL reports file not found.
python3 -m dfss --url=open@sophgo.com:/ext_model_information/LLM/LLM-TPU/llama3.1-8b_int8_1dev_seq512.bmodel
python3 -m dfss --url=open@sophgo.com:/ext_model_information/LLM/LLM-TPU/llama3.1-8b_int8_1dev_seq1024.bmodel
python3 -m dfss --url=open@sophgo.com:/ext_model_information/LLM/LLM-TPU/llama3.1-8b_int8_1dev_seq2048.bmodel
python3 -m dfss --url=open@sophgo.com:/ext_model_information/LLM/LLM-TPU/llama3.1-8b_int8_1dev_seq4096.bmodel
are available.