Investigating a soft constraint on the recurrent weight matrix orthogonality. Comparison to gradient clipping on sequential MNIST.
Primary LanguagePython