mybigday/llama.rn

Implementing optimizations from layla

Vali-98 opened this issue · 2 comments

Layla is a project that also integrates llama.cpp for mobile use:
https://github.com/l3utterfly/llama.cpp/tree/layla-build

After some quick testing, Layla's llama.cpp fork does seem to run models far faster on Android than llama.rn, almost twice as fast in some cases with 7B models.

It would be wonderful if these improvements were added to llama.rn as well.

Interesting, I just took a quick look. Does it have CLBlast enabled?

ggerganov/llama.cpp@master...l3utterfly:llama.cpp:layla-build

/cc @l3utterfly
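For context, CLBlast was an optional OpenCL-based GPU backend in llama.cpp around the time of this issue. A minimal sketch of building with it enabled, assuming the CLBlast and OpenCL development packages are installed (the `LLAMA_CLBLAST` flag name is from llama.cpp's CMake build of that era and has since been superseded by other backends):

```shell
# Sketch: build llama.cpp with the CLBlast backend enabled.
# Assumes CLBlast + OpenCL headers/libraries are already installed on the system.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Configure with the (historical) CLBlast option turned on.
cmake -B build -DLLAMA_CLBLAST=ON

# Compile in release mode.
cmake --build build --config Release
```

Whether this alone accounts for the speed difference would still need profiling; Layla's fork may also carry Android-specific patches beyond the backend choice.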