NVIDIA/Megatron-LM

[QUESTION] Does Megatron-Core support LLAMA models?

Opened this issue · 5 comments

Does Megatron-Core support LLAMA models?

yes

@ethanhe42 When transformer-impl is local, it reports the following error:
AssertionError: (RMSNorm) is not supported in FusedLayerNorm (raised while instantiating TransformerLayer)
And when transformer-impl is transformer_engine, the code below does not seem to define RMSNorm:
[screenshot]
So do I need to make any changes when I want to use LLAMA?

You need to use the mcore models; local is being deprecated.
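As a sketch of what that means in practice, the mcore code path with Transformer Engine and RMSNorm is typically selected through launch flags like the ones below. The exact flag set is an assumption based on Megatron-LM's argument parser; verify them against the arguments file in your checkout before use.

```shell
# Hedged sketch: launching a LLAMA-style model on the mcore path.
# Flag names are assumptions -- check your Megatron-LM version.
torchrun --nproc_per_node=8 pretrain_gpt.py \
    --use-mcore-models \
    --transformer-impl transformer_engine \
    --normalization RMSNorm \
    --swiglu \
    --position-embedding-type rope \
    --untie-embeddings-and-output-weights \
    ...  # plus the usual model-size, data, and optimizer arguments
```

With --transformer-impl transformer_engine, RMSNorm is provided by Transformer Engine rather than by a local fused kernel, which is why the local path asserts.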

@ethanhe42 When transformer-impl is set to transformer_engine, the code below does not seem to define RMSNorm:
[screenshot]

It's handled by TENorm.
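To make that answer concrete: the transformer_engine spec does not need a standalone RMSNorm class, because a TENorm-style wrapper picks the Transformer Engine norm named in the config. The sketch below is a hypothetical, self-contained illustration of that dispatch pattern, not Megatron-Core's actual code; the class and field names mirror the real ones but the bodies are stand-ins.

```python
# Hypothetical illustration of TENorm-style dispatch (not Megatron-Core's
# actual implementation): the factory inspects config.normalization and
# returns the matching Transformer Engine norm class.

from dataclasses import dataclass


@dataclass
class TransformerConfig:
    normalization: str = "RMSNorm"  # or "LayerNorm"


class TELayerNorm:
    """Stand-in for Transformer Engine's fused LayerNorm."""


class TERMSNorm:
    """Stand-in for Transformer Engine's fused RMSNorm."""


class TENorm:
    """Factory: returns the norm instance named by the config."""

    def __new__(cls, config: TransformerConfig):
        if config.normalization == "LayerNorm":
            return TELayerNorm()
        if config.normalization == "RMSNorm":
            return TERMSNorm()
        raise ValueError(f"unknown normalization {config.normalization!r}")


norm = TENorm(TransformerConfig(normalization="RMSNorm"))
print(type(norm).__name__)  # → TERMSNorm
```

So with --normalization RMSNorm on the transformer_engine path, the layer spec resolves to Transformer Engine's RMSNorm and no local RMSNorm definition is required.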