lucidrains/mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention experts, akin to mixture-of-experts
Language: Python · License: MIT
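
Below is a minimal sketch of the core idea: a learned router assigns each token to one of several causal self-attention "experts", analogous to how mixture-of-experts routes tokens to feedforward experts. For brevity the sketch mixes all expert outputs softly by the router probabilities; the repository's actual experiments route token subsets, and the names `MixtureOfAttention`, `CausalSelfAttention`, and all hyperparameters here are illustrative assumptions, not the repository's API.

```python
# Hypothetical sketch of routing tokens to attention experts.
# Not the repository's actual implementation.

import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # boolean mask: True = position may NOT be attended to (future tokens)
        n = x.shape[1]
        mask = torch.triu(torch.ones(n, n, dtype=torch.bool, device=x.device), 1)
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out

class MixtureOfAttention(nn.Module):
    def __init__(self, dim, num_experts=4, heads=4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # per-token routing logits
        self.experts = nn.ModuleList(
            [CausalSelfAttention(dim, heads) for _ in range(num_experts)]
        )

    def forward(self, x):
        # Soft routing: weight every expert's output by the router probability.
        # A dense stand-in for sparse token routing, so gradients reach the router.
        probs = self.router(x).softmax(dim=-1)                           # (b, n, e)
        expert_outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (b, n, d, e)
        return torch.einsum('bnde,bne->bnd', expert_outs, probs)

if __name__ == '__main__':
    moa = MixtureOfAttention(dim=64, num_experts=4)
    tokens = torch.randn(2, 16, 64)
    print(moa(tokens).shape)  # torch.Size([2, 16, 64])
```

A sparse variant would instead select the top-k tokens per expert and run attention only over each routed subset, trading the dense mixture above for lower compute per expert.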