Will it compare performance with llama-moe?
ccccj opened this issue · 1 comment
ccccj commented
llama-moe:
https://github.com/pjlab-sys4nlp/llama-moe/tree/main
Or will a training framework be released with llama as the base model?
DeepSeekDDM commented
Our DeepSeekMoE model is trained from scratch; it is not resumed or initialized from LLaMA/LLaMA2 checkpoints.
As for absolute performance, please refer to our paper and theirs for details.