Implementing optimizations from layla
Vali-98 opened this issue · 2 comments
Vali-98 commented
Layla is a project that also integrates llamacpp for mobile use:
https://github.com/l3utterfly/llama.cpp/tree/layla-build
After some quick testing, Layla's llama.cpp fork appears to run models far faster on Android than llama.rn, almost twice as fast in some cases with 7B models.
It would be wonderful if these improvements were added to llama.rn as well.
jhen0409 commented
Interesting, I just took a quick look. Is CLBlast enabled?
ggerganov/llama.cpp@master...l3utterfly:llama.cpp:layla-build
/cc @l3utterfly
l3utterfly commented
Yes, the Layla build has CLBlast enabled
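For context, a minimal sketch of how CLBlast acceleration was typically enabled when building llama.cpp in this period (the `LLAMA_CLBLAST` CMake flag is from llama.cpp builds of that era and may have since been renamed or removed; CLBlast and OpenCL development headers must be installed on the system first):

```shell
# Build llama.cpp with the CLBlast (OpenCL BLAS) backend enabled.
# Assumes CLBlast and an OpenCL SDK are already installed and discoverable by CMake.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build && cd build
cmake .. -DLLAMA_CLBLAST=ON
cmake --build . --config Release
```

For an Android target such as llama.rn, the same flag would be passed through the NDK's CMake toolchain, with the OpenCL library loaded from the device's vendor driver.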