A minimal multi-head self-attention Transformer architecture, implemented in PyTorch, with experimental features.
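As a rough sketch of the core building block such a repository would contain, here is a minimal multi-head self-attention module in PyTorch. The class and parameter names are illustrative assumptions, not the repository's actual API:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    """Minimal multi-head self-attention (illustrative sketch)."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projection
        self.out = nn.Linear(d_model, d_model)      # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape  # (batch, sequence length, embedding dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Split the embedding dim into heads: (B, n_heads, T, d_head)
        q = q.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        # Scaled dot-product attention per head
        att = (q @ k.transpose(-2, -1)) / (self.d_head ** 0.5)
        att = F.softmax(att, dim=-1)
        y = att @ v  # (B, n_heads, T, d_head)
        # Recombine heads and project back to d_model
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.out(y)

# Usage: a batch of 2 sequences, length 5, embedding dim 32, 4 heads
attn = MultiHeadSelfAttention(d_model=32, n_heads=4)
x = torch.randn(2, 5, 32)
print(attn(x).shape)  # torch.Size([2, 5, 32])
```

The output shape matches the input, so the module can be stacked with residual connections and feed-forward layers to form a full Transformer block.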
Primary language: Python · License: MIT