shenweichen/DeepCTR-Torch

In the MoE method, do the experts have to be trained, or can a frozen model be used as an expert?

Harzva opened this issue · 1 comment

Describe the question
Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts

In the MoE method, do the experts have to be trained, or can a frozen pretrained model (such as GPT-3 or BERT) be used as an expert?
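For concreteness, the setup being asked about could be sketched roughly as below in plain PyTorch. This is only an illustration under stated assumptions: `FrozenExpertMoE` is a hypothetical class, not part of DeepCTR-Torch, and the small `nn.Linear` layers are stand-ins for pretrained models. It is also not the setup in the MMoE paper, where the experts are trained jointly with the gates.

```python
import torch
import torch.nn as nn


class FrozenExpertMoE(nn.Module):
    """Hypothetical mixture-of-experts layer with frozen experts and a trainable gate."""

    def __init__(self, experts, input_dim):
        super().__init__()
        self.experts = nn.ModuleList(experts)
        # Freeze every expert so no gradients are computed for its weights.
        for p in self.experts.parameters():
            p.requires_grad_(False)
        # The gate is the only trainable part of this layer.
        self.gate = nn.Linear(input_dim, len(experts))

    def forward(self, x):
        # Stack expert outputs: (batch, num_experts, out_dim)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)
        # Softmax gate weights: (batch, num_experts, 1)
        weights = torch.softmax(self.gate(x), dim=-1).unsqueeze(-1)
        # Weighted sum over experts: (batch, out_dim)
        return (weights * expert_out).sum(dim=1)


torch.manual_seed(0)
# Stand-ins for pretrained models (assumption: all experts share one output size).
experts = [nn.Linear(8, 4) for _ in range(3)]
moe = FrozenExpertMoE(experts, input_dim=8)

# Hand only the trainable (gate) parameters to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in moe.parameters() if p.requires_grad], lr=0.1
)

x, target = torch.randn(16, 8), torch.randn(16, 4)
loss = nn.functional.mse_loss(moe(x), target)
loss.backward()
optimizer.step()
```

In this sketch only the gate receives gradients, so the experts stay exactly as they were loaded. Whether that works well for a given task is a separate question that the code does not answer.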

Thank you very much.