Model performance
dexKing07 opened this issue · 0 comments
Good morning,
I have run MLC Chat successfully, but I noticed two things:
- The launch syntax is different (but this was resolved quickly).
- The model performance: I get 1.9 tok/sec for RedPajama-3b instead of the expected 5 tok/sec (see picture below).
Is there anything I have forgotten, or something I can check, to reach the expected speed? The OpenCL drivers are installed correctly, and I verified that while the model was running the work was done by the GPU, not the CPU.
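
To put the gap in numbers, here is a minimal sketch of the ratio between the two speeds. The function name `throughput_ratio` is hypothetical and not part of MLC Chat; the two figures are simply the ones quoted above.

```python
def throughput_ratio(measured_tok_per_sec: float, expected_tok_per_sec: float) -> float:
    """Return measured throughput as a fraction of the expected throughput."""
    if expected_tok_per_sec <= 0:
        raise ValueError("expected throughput must be positive")
    return measured_tok_per_sec / expected_tok_per_sec


# Figures from this report: 1.9 tok/sec measured, 5 tok/sec expected.
ratio = throughput_ratio(1.9, 5.0)
print(f"{ratio:.0%} of the expected speed")
```

So the model is running at well under half of the expected speed, which seems too large a gap to be normal run-to-run variation.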
Thank you in advance for the support.