alexpashevich/E.T.
Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal transformer that encodes language inputs and the full episode history of visual observations and actions.
CMIT
Stargazers
- Anirudh257 (PhD at CRCV, UCF)
- anisha2102 (University of Texas at Austin)
- arjunakula (Google)
- Buzz-Beater (VCLA@UCLA)
- davidnvq (Japan)
- dsx-ai (shanghai)
- emigmo (Tsinghua University)
- fly51fly (PRIS)
- g-jing
- Gasoonjia (From Beijing, live in San Jose)
- guhur (Argile)
- Haoyu6iu
- hhu06
- Jackie-Chou
- jeasinema (UCLA)
- JiayunjieJYJ
- Jielin-Qiu (Carnegie Mellon University)
- mattdeitke (@allenai ✗ udub)
- MinghuiChen43 (University of British Columbia)
- MohitShridhar (Google DeepMind)
- mxu34 (Carnegie Mellon University)
- nikepupu
- oztc (@onemee_ai)
- pranaygupta36 (Robotics Institute, Carnegie Mellon University)
- SeanJia (UC San Diego)
- stevenlsw (University of Illinois Urbana-Champaign)
- wang-sj16 (Brown University)
- wkqly
- xhding1997
- yizhouzhao
- ysymyth (Princeton University)
- YTEP-ZHI (The Chinese University of Hong Kong)
- YujieLu10 (UC Santa Barbara)
- Yutong-Zhou-cv (Germany)
- zfchenUnique (The University of Hong Kong)
- zl9501