Chrisz236/llm-rk3588

Model performance

dexKing07 opened this issue · 0 comments

Good morning,
I have successfully run MLC Chat, but I noticed two things:

  • the launch syntax is different (but this was resolved quickly)
  • the model performance: I get 1.9 tok/sec for RedPajama-3b instead of the expected 5 tok/sec (see picture below)

Is there anything I've forgotten, or something I can check, to reach the expected result? The OpenCL drivers are installed correctly, and I verified that while the model was generating, the GPU was doing the work rather than the CPU.
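
One way to double-check GPU usage while the model is generating is to poll the GPU load from sysfs. The following is only a minimal sketch: the path `/sys/class/devfreq/fb000000.gpu/load` and its `load@freqHz` format are assumptions about the RK3588 kernel and may differ on other images, and the helper names `parse_gpu_load`, `read_gpu_load` and `monitor` are hypothetical.

```python
import re
import time
from pathlib import Path

# ASSUMPTION: sysfs node for the Mali GPU on RK3588; adjust for your kernel.
GPU_LOAD_PATH = Path("/sys/class/devfreq/fb000000.gpu/load")


def parse_gpu_load(text):
    """Parse a string like '45@800000000Hz' into (load_percent, freq_hz)."""
    match = re.fullmatch(r"\s*(\d+)@(\d+)Hz\s*", text)
    if match is None:
        raise ValueError(f"unexpected load format: {text!r}")
    return int(match.group(1)), int(match.group(2))


def read_gpu_load(path=GPU_LOAD_PATH):
    """Read and parse the current GPU load from the assumed sysfs node."""
    return parse_gpu_load(Path(path).read_text())


def monitor(samples=10, interval=1.0, path=GPU_LOAD_PATH):
    """Print the GPU load and clock once per interval, for `samples` readings."""
    for _ in range(samples):
        load, freq_hz = read_gpu_load(path)
        print(f"GPU load {load}% at {freq_hz / 1e6:.0f} MHz")
        time.sleep(interval)
```

Calling `monitor()` in a second terminal during generation shows whether the GPU is busy and which clock it is running at. A load near zero would suggest the model is not running on the GPU, and a low clock could be one possible reason for lower throughput.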

Thank you in advance for the support.

Image