/ndr

The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".

Primary language: Python
License: MIT
