hpcaitech/ColossalAI

[FEATURE]: Is it Possible to integrate Liger-Kernel?

Opened this issue · 8 comments

Describe the feature

https://github.com/linkedin/Liger-Kernel

Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%. We have implemented Hugging Face Compatible RMSNorm, RoPE, SwiGLU, CrossEntropy, FusedLinearCrossEntropy, and more to come.

Seems like a pretty lightweight library. cc @ver217 @isky-cd Any take on this? 😃
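For context, Liger-Kernel's integration style is to patch a Hugging Face modeling module so that subsequently constructed layers use the fused Triton kernels (its README exposes helpers like `apply_liger_kernel_to_llama()`). Below is a minimal stdlib sketch of that monkey-patching pattern; all names here (`SlowRMSNorm`, `FusedRMSNorm`, `apply_fused_kernels`) are hypothetical stand-ins, not real Liger-Kernel or Transformers symbols.

```python
import types

# Stand-in for an already-imported HF modeling module, e.g. modeling_llama.
modeling = types.ModuleType("modeling_fake")

class SlowRMSNorm:
    impl = "eager"

class FusedRMSNorm:
    impl = "triton"  # in the real library this would wrap a Triton kernel

modeling.RMSNorm = SlowRMSNorm

def apply_fused_kernels(mod):
    """Patch the module's norm class in place, analogous to what
    apply_liger_kernel_to_llama() does for the real Llama modeling file."""
    mod.RMSNorm = FusedRMSNorm

apply_fused_kernels(modeling)
layer = modeling.RMSNorm()  # any layer built after patching is the fused one
print(layer.impl)  # -> "triton"
```

Because the patch happens at the module level, it only affects layers constructed after the call, which is why these helpers are typically invoked before model instantiation.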

> Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%. We have implemented Hugging Face Compatible RMSNorm, RoPE, SwiGLU, CrossEntropy, FusedLinearCrossEntropy, and more to come.

I think this is a good attempt.

How does it compare with Apex's implementation? We've integrated some Apex CUDA kernels, and some of them are also implemented in Liger-Kernel.

I think Apex only provides fused RMSNorm and LayerNorm kernels? Liger-Kernel has some more.

Any good news? Thanks a lot

I think they are short-handed wrapping up Zero Bubble and hybrid sequence parallelism, and then they will focus on Accelerate integration?
Feel free to ask other members to clarify further, but it'd be great if the community could make an initial PR on this; then we can help and comment. This is an open-source initiative after all, and we always welcome contributions 🙂

Is there any documentation available on how to integrate a new kernel, so we can open a PR for this? Thanks a lot.

May I ask whether you have a concept similar to Hugging Face's modeling_xxx files for organizing modules? I have experience adding Liger-Kernel there, and perhaps directly replacing the relevant modules would work.
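The "directly replacing the relevant modules" idea can also be done on an already-instantiated model: walk the submodule tree and swap every norm layer for a fused one. Here is a minimal stdlib sketch of that traversal; the tiny `Module` class stands in for `torch.nn.Module`, and `FusedRMSNorm` stands in for something like `liger_kernel.transformers.LigerRMSNorm` (names chosen for illustration, not taken from either codebase).

```python
# Hypothetical stand-ins for torch.nn.Module and the model's layers.
class Module:
    def __init__(self):
        self.children = {}

    def add(self, name, child):
        """Register a named submodule, mirroring nn.Module attribute assignment."""
        self.children[name] = child
        setattr(self, name, child)

class RMSNorm(Module): pass       # the eager implementation to be replaced
class FusedRMSNorm(Module): pass  # stand-in for a Liger-style fused kernel
class Attention(Module): pass

def swap_modules(root, old_cls, new_cls):
    """Recursively replace every instance of old_cls with a fresh new_cls."""
    for name, child in list(root.children.items()):
        if isinstance(child, old_cls):
            root.add(name, new_cls())
        else:
            swap_modules(child, old_cls, new_cls)

block = Module()
block.add("attn", Attention())
block.add("norm", RMSNorm())

swap_modules(block, RMSNorm, FusedRMSNorm)
print(type(block.norm).__name__)  # -> "FusedRMSNorm"
```

In real code the same traversal is usually written with `model.named_children()` and `setattr`; the caveat is that a drop-in replacement must match the original layer's constructor arguments and weight names so checkpoints still load.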