Multi-Head Attention Implementation with EinsumDense

This repository implements a custom multi-head attention (MHA) layer using tf.keras.layers.experimental.EinsumDense. The reference MHA implementation comes from the TensorFlow tutorial.
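To illustrate the idea, here is a minimal sketch of the einsum contractions that an EinsumDense-based MHA layer could use. It is written in plain NumPy so the shapes can be checked without TensorFlow, and it is not the code from this repository. The function name `mha_einsum`, the weight layouts, and the equation strings are assumptions chosen for illustration; biases, masking, and dropout are omitted. A projection equation such as "abc,cde->abde" is the kind of string that can be passed as the `equation` argument of EinsumDense, for example with `output_shape=(None, num_heads, head_dim)`.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def mha_einsum(query, key, value, wq, wk, wv, wo):
    """Hypothetical multi-head attention forward pass built only from einsum.

    query:      (batch, tq, d_model)
    key, value: (batch, tk, d_model)
    wq, wk, wv: (d_model, heads, head_dim)  assumed projection weights
    wo:         (heads, head_dim, d_model)  assumed output weights
    """
    # Project and split into heads in a single contraction (no reshape/transpose).
    q = np.einsum("abc,cde->abde", query, wq)  # (batch, tq, heads, head_dim)
    k = np.einsum("abc,cde->abde", key, wk)    # (batch, tk, heads, head_dim)
    v = np.einsum("abc,cde->abde", value, wv)  # (batch, tk, heads, head_dim)

    # Scaled dot-product scores per head: (batch, heads, tq, tk).
    head_dim = q.shape[-1]
    scores = np.einsum("aqhd,akhd->ahqk", q, k) / np.sqrt(head_dim)
    weights = softmax(scores, axis=-1)

    # Weighted sum of values per head: (batch, tq, heads, head_dim).
    context = np.einsum("ahqk,akhd->aqhd", weights, v)

    # Merge heads and project back to d_model: (batch, tq, d_model).
    out = np.einsum("abde,dec->abc", context, wo)
    return out, weights


# Self-attention on random data with illustrative sizes.
rng = np.random.default_rng(0)
batch, seq, d_model, heads = 2, 5, 16, 4
head_dim = d_model // heads

x = rng.normal(size=(batch, seq, d_model))
wq, wk, wv = (rng.normal(size=(d_model, heads, head_dim)) for _ in range(3))
wo = rng.normal(size=(heads, head_dim, d_model))

out, weights = mha_einsum(x, x, x, wq, wk, wv, wo)
print(out.shape, weights.shape)
```

Folding the head split into the projection equation is what removes the separate reshape and transpose steps that a Dense-based implementation typically needs.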