NVIDIA/Megatron-LM

[QUESTION] Does Megatron-Core support LLAMA models?

Opened this issue · 5 comments

Does Megatron-Core support LLAMA models?

yes

@ethanhe42 When transformer-impl is local, it reports the following error:
AssertionError: (RMSNorm) is not supported in FusedLayerNorm (raised while instantiating TransformerLayer)
And when transformer-impl is transformer_engine, the code below does not seem to define RMSNorm:
[screenshot]
So do I need to make any changes when I want to use LLAMA?

You need to use the mcore models; local is being deprecated.
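As a sketch of what that means in practice, the mcore code path with Transformer Engine and RMSNorm is typically selected through launch flags like the ones below. The exact flag set is an assumption based on Megatron-LM's argument parser; verify them against the arguments file in your checkout before use.

```shell
# Hedged sketch: launching a LLAMA-style model on the mcore path.
# Flag names are assumptions -- check your Megatron-LM version.
torchrun --nproc_per_node=8 pretrain_gpt.py \
    --use-mcore-models \
    --transformer-impl transformer_engine \
    --normalization RMSNorm \
    --swiglu \
    --position-embedding-type rope \
    --untie-embeddings-and-output-weights \
    ...  # plus the usual model-size, data, and optimizer arguments
```

With --transformer-impl transformer_engine, RMSNorm is provided by Transformer Engine rather than by a local fused kernel, which is why the local path asserts.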

@ethanhe42 When transformer-impl is set to transformer_engine, the code below does not seem to define RMSNorm:
[screenshot]

It's handled by TENorm.
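To make that answer concrete: the transformer_engine spec does not need a standalone RMSNorm class, because a TENorm-style wrapper picks the Transformer Engine norm named in the config. The sketch below is a hypothetical, self-contained illustration of that dispatch pattern, not Megatron-Core's actual code; the class and field names mirror the real ones but the bodies are stand-ins.

```python
# Hypothetical illustration of TENorm-style dispatch (not Megatron-Core's
# actual implementation): the factory inspects config.normalization and
# returns the matching Transformer Engine norm class.

from dataclasses import dataclass


@dataclass
class TransformerConfig:
    normalization: str = "RMSNorm"  # or "LayerNorm"


class TELayerNorm:
    """Stand-in for Transformer Engine's fused LayerNorm."""


class TERMSNorm:
    """Stand-in for Transformer Engine's fused RMSNorm."""


class TENorm:
    """Factory: returns the norm instance named by the config."""

    def __new__(cls, config: TransformerConfig):
        if config.normalization == "LayerNorm":
            return TELayerNorm()
        if config.normalization == "RMSNorm":
            return TERMSNorm()
        raise ValueError(f"unknown normalization {config.normalization!r}")


norm = TENorm(TransformerConfig(normalization="RMSNorm"))
print(type(norm).__name__)  # → TERMSNorm
```

So with --normalization RMSNorm on the transformer_engine path, the layer spec resolves to Transformer Engine's RMSNorm and no local RMSNorm definition is required.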