intel-analytics/ipex-llm

ipex-llm version 0510 shows a performance regression compared with 0430, especially for BS=16, 32 and 8k input

Closed this issue · 3 comments

Please enable low memory mode and check whether the issue still occurs. You can enable it with export IPEX_LLM_LOW_MEM=1.
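For reference, a minimal sketch of setting the variable and confirming that child processes can see it before the benchmark is launched. Only the variable name IPEX_LLM_LOW_MEM comes from this thread; the low_mem_status helper variable is hypothetical and used purely for illustration, and the benchmark command itself is not shown because it was not posted here.

```shell
# Enable low memory mode for the current shell session
# (variable name taken from the comment above).
export IPEX_LLM_LOW_MEM=1

# Check that the variable is inherited by child processes, since the
# benchmark will run as one. "low_mem_status" is a hypothetical name.
low_mem_status="$(sh -c 'echo "${IPEX_LLM_LOW_MEM:-unset}"')"
echo "IPEX_LLM_LOW_MEM=${low_mem_status}"
```

The export has to run in the same shell session that starts the benchmark; a variable set in another terminal will not carry over.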

Cannot reproduce the performance regression from 0430 to 0510. We may need to check the scripts and environment settings with the user later.

No performance regression was observed in the stable release validation test, so closing this issue for now.