/macaron-net

Code for "Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View"

Primary Language: Python
License: BSD 3-Clause "New" or "Revised" License (BSD-3-Clause)
