intel-analytics/ipex-llm

Support both Llama2 and stablelm/Zephyr-3B

Closed this issue · 2 comments

Llama2 works with transformers <= 4.37.2, but fails with transformers >= 4.38.0 with the following error:
TypeError: llama_model_forward_4_36() got an unexpected keyword argument 'cache_position'
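
For reference, a minimal sketch of keeping Llama2 on a compatible transformers version when loading it through ipex-llm; the explicit version guard and the model id are illustrative assumptions, not taken from this issue:

```python
import transformers
from packaging import version

# Sketch only: ipex-llm's Llama2 path fails on transformers >= 4.38.0, where
# generation passes a `cache_position` argument that the patched
# llama_model_forward_4_36() does not accept, so guard the version up front.
if version.parse(transformers.__version__) >= version.parse("4.38.0"):
    raise RuntimeError("Use transformers <= 4.37.2 for Llama2 with ipex-llm")

from ipex_llm.transformers import AutoModelForCausalLM

# Load Llama2 with ipex-llm's low-bit optimization (model id is an assumption).
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    load_in_4bit=True,
)
```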

On the other hand, stablelm/Zephyr-3B requires transformers >= 4.38.0 to run. With transformers <= 4.37.2, it reports:
The checkpoint you are trying to load has model type stablelm but Transformers does not recognize this architecture
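
Conversely, a minimal sketch of loading stablelm/Zephyr-3B via ipex-llm on a new-enough transformers; the version check and the Hugging Face model id are assumptions for illustration:

```python
import transformers
from packaging import version

# Sketch only: the `stablelm` architecture is only recognized by
# transformers >= 4.38.0, so fail fast on older versions.
if version.parse(transformers.__version__) < version.parse("4.38.0"):
    raise RuntimeError("stablelm/Zephyr-3B needs transformers >= 4.38.0")

from ipex_llm.transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stablelm-zephyr-3b",  # assumed model id
    load_in_4bit=True,
    trust_remote_code=True,
)
```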

We need to run both the Zephyr-3B and Llama2 models, but no single transformers version supports both in the ipex-llm environment.

We could support llama2-7b with transformers 4.38.0.
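
Once that support lands, the desired end state would look roughly like the sketch below: both models loaded in a single environment on transformers 4.38.0. The model ids are assumptions, not from this issue.

```python
from ipex_llm.transformers import AutoModelForCausalLM

# Sketch of the target setup: one environment, transformers 4.38.0,
# both models loaded through ipex-llm's low-bit loader.
llama2 = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    load_in_4bit=True,
)
zephyr = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stablelm-zephyr-3b",
    load_in_4bit=True,
    trust_remote_code=True,
)
```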

Thank you very much! Currently, performance with 0516 + 4.38.0 is comparable to 0512 + 4.37.2. I'm closing this issue.