Could anyone port deepseek-moe to llama2.c?
win10ogod opened this issue · 0 comments
win10ogod commented
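DeepSeek-MoE uses an MoE architecture built on two principal strategies: fine-grained expert segmentation (experts are split into smaller ones and more of them are activated per token) and shared expert isolation (a few experts are always active for every token, alongside the routed ones).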
https://github.com/deepseek-ai/DeepSeek-MoE/tree/main
https://huggingface.co/deepseek-ai/deepseek-moe-16b-chat