intel-analytics/ipex-llm

Run neural-chat 7b inference with Deepspeed on Flex 140. #10507

Closed this issue · 4 comments

Hi,

After reviewing the previous issue: #10507.

We tested on a Flex 140 following the same suggestion, but inference performance is very slow with both GPUs running.

Env.txt

xpu-smi shows very low GPU usage. The Flex 140 has 12 GB total, so each of its two GPUs should have 6 GB.
The platform setup configuration is attached; please let us know if anything else needs to be added.
(Screenshots attached: MicrosoftTeams-image (1), MicrosoftTeams-image)

Hi, @weiseng-yeap , we have updated our env-check script. Could you please try the new script (https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/scripts) and attach the related information?

Besides, GPU memory usage mainly depends on the model size, the applied precision, and the input length. According to your screenshot, the GPU power draw is very low.
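As a rough sanity check of the memory side of this, the weight footprint per GPU can be estimated from the parameter count and precision. The sketch below is illustrative only: the byte-per-parameter figures are the standard ones for each dtype, the even two-way split assumes DeepSpeed AutoTP shards weights evenly across the two Flex 140 GPUs, and KV cache and activations are deliberately excluded.

```python
# Rough weight-memory estimate for a ~7B-parameter model such as neural-chat-7b.
# Assumptions: even sharding across GPUs, weights only (no KV cache/activations).

PARAMS = 7e9          # ~7 billion parameters
BYTES_PER_PARAM = {
    "fp16": 2.0,      # 16-bit floats
    "int8": 1.0,      # 8-bit quantization
    "int4": 0.5,      # 4-bit quantization (e.g. ipex-llm low-bit)
}
NUM_GPUS = 2          # Flex 140 exposes two 6 GB GPUs

def weight_gb(precision: str, num_gpus: int = NUM_GPUS) -> float:
    """Approximate weight memory per GPU in GiB (weights only)."""
    total_bytes = PARAMS * BYTES_PER_PARAM[precision]
    return total_bytes / 1024**3 / num_gpus

for p in BYTES_PER_PARAM:
    print(f"{p}: ~{weight_gb(p):.1f} GiB per GPU")
```

Note that by this estimate fp16 weights alone (~6.5 GiB per GPU) would already exceed the 6 GB available per GPU on a Flex 140, which is why a low-bit precision is needed here.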

Hi BinBin,

Using the latest script; the latest log is attached.
Env_V2.txt


It seems intel-fw-gpu and intel-i915-dkms are not installed. Please try sudo apt install intel-i915-dkms intel-fw-gpu first.
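For reference, this kind of check can be reproduced by scanning `dpkg -l` output for the two driver packages. This is only a sketch of the idea (the actual env-check script may do it differently), and the sample output embedded below is illustrative, not taken from the attached logs.

```python
# Detect missing Intel GPU driver packages in `dpkg -l`-style output.
# The sample text at the bottom is made up for illustration.

REQUIRED = ["intel-i915-dkms", "intel-fw-gpu"]

def missing_packages(dpkg_output: str, required=REQUIRED) -> list:
    """Return required packages without an installed ('ii') dpkg entry."""
    installed = set()
    for line in dpkg_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ii":
            # dpkg may append the architecture, e.g. 'intel-fw-gpu:all'
            installed.add(fields[1].split(":")[0])
    return [pkg for pkg in required if pkg not in installed]

sample = """\
ii  level-zero      1.14.0  amd64  oneAPI Level Zero loader
ii  intel-fw-gpu    2023.39 all    Firmware for Intel GPUs
"""
print(missing_packages(sample))  # intel-i915-dkms is absent in this sample
```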

Attached the latest env log.
Uploading Env_V3.txt…