Attention-Is-All-You-Need-Mesh-Tensorflow

An implementation of the paper "Attention Is All You Need" by Vaswani et al. (Google Brain) using Mesh TensorFlow.

Primary language: Python
License: GNU General Public License v3.0 (GPL-3.0)
