MotorCityCobra/MoE_stacked_w_Attention

Three mixture-of-experts models passed through multi-head attention into a final mixture-of-experts model
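The data flow in the description can be sketched in a few lines of dependency-free Python. This is only an illustrative sketch, not the notebook's actual code. It makes several assumptions the description does not state: all three branch models see the same input, gating is dense (a softmax over all experts rather than top-k routing), each expert is a single linear map, and the attended outputs are concatenated before the final model. The names `MoE`, `MultiHeadSelfAttention`, and `StackedMoE` and all dimensions are hypothetical.

```python
import math
import random


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]


class MoE:
    """Dense (soft-gated) mixture of linear experts. Assumed design."""

    def __init__(self, d_in, d_out, n_experts, rng):
        self.gate = rand_matrix(n_experts, d_in, rng)
        self.experts = [rand_matrix(d_out, d_in, rng) for _ in range(n_experts)]

    def gate_probs(self, x):
        return softmax(matvec(self.gate, x))

    def __call__(self, x):
        probs = self.gate_probs(x)
        outs = [matvec(E, x) for E in self.experts]
        # Gate-weighted sum of the expert outputs.
        return [sum(p * o[j] for p, o in zip(probs, outs))
                for j in range(len(outs[0]))]


class MultiHeadSelfAttention:
    """Scaled dot-product self-attention over a short list of vectors."""

    def __init__(self, d_model, n_heads, rng):
        assert d_model % n_heads == 0
        self.d_head = d_model // n_heads
        # One (Wq, Wk, Wv) triple per head.
        self.heads = [tuple(rand_matrix(self.d_head, d_model, rng)
                            for _ in range(3)) for _ in range(n_heads)]
        self.out = rand_matrix(d_model, d_model, rng)

    def __call__(self, tokens):
        per_token = [[] for _ in tokens]
        for Wq, Wk, Wv in self.heads:
            Q = [matvec(Wq, t) for t in tokens]
            K = [matvec(Wk, t) for t in tokens]
            V = [matvec(Wv, t) for t in tokens]
            for i, q in enumerate(Q):
                scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(self.d_head)
                          for k in K]
                w = softmax(scores)
                # Concatenate this head's output onto token i.
                per_token[i] += [sum(wj * v[c] for wj, v in zip(w, V))
                                 for c in range(self.d_head)]
        return [matvec(self.out, t) for t in per_token]


class StackedMoE:
    """Three branch MoEs -> multi-head attention -> final MoE."""

    def __init__(self, d_in=8, d_model=4, d_out=2, n_experts=3, n_heads=2, seed=0):
        rng = random.Random(seed)
        self.branches = [MoE(d_in, d_model, n_experts, rng) for _ in range(3)]
        self.attn = MultiHeadSelfAttention(d_model, n_heads, rng)
        self.final = MoE(3 * d_model, d_out, n_experts, rng)

    def __call__(self, x):
        tokens = [branch(x) for branch in self.branches]  # one vector per branch
        attended = self.attn(tokens)                      # branches attend to each other
        flat = [v for t in attended for v in t]           # concatenate (assumption)
        return self.final(flat)


model = StackedMoE()
y = model([0.1 * i for i in range(8)])
print(len(y))
```

Treating each branch output as one token lets the attention step mix information across the three models before the final gate sees it. A trainable version would replace these list-based helpers with a tensor library.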

Jupyter Notebook
