ortho-rnn: Investigating a soft orthogonality constraint on the recurrent weight matrix, compared with gradient clipping on sequential MNIST.
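As a rough illustration of the two techniques being compared, here is a minimal pure-Python sketch. It assumes the soft constraint takes the common form of a penalty `strength * ||W^T W - I||_F^2` added to the training loss, and that clipping rescales gradients by their global L2 norm. Both forms, and the names `orthogonality_penalty`, `clip_by_global_norm`, `strength`, and `max_norm`, are assumptions made for illustration, not details taken from this project.

```python
import math


def orthogonality_penalty(W, strength=1.0):
    """Return strength * ||W^T W - I||_F^2 for a square matrix W (list of rows).

    The penalty is zero exactly when W is orthogonal and grows as the
    columns of W drift away from being orthonormal.
    """
    n = len(W)
    total = 0.0
    for i in range(n):
        for j in range(n):
            # (W^T W)[i][j] is the dot product of columns i and j of W.
            gram_ij = sum(W[k][i] * W[k][j] for k in range(n))
            target = 1.0 if i == j else 0.0
            total += (gram_ij - target) ** 2
    return strength * total


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm == 0.0 or norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


# A rotation matrix is orthogonal, so its penalty is (numerically) zero.
theta = 0.3
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
rotation_penalty = orthogonality_penalty(rotation)

# Scaling the identity by 2 gives W^T W - I = 3I, which is penalised.
scaled_penalty = orthogonality_penalty([[2.0, 0.0], [0.0, 2.0]], strength=0.5)

# A gradient of norm 5 is rescaled to norm 1; its direction is unchanged.
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

In a training loop, a penalty of this kind would typically be added to the task loss, so the recurrent matrix is pulled toward orthogonality rather than forced onto it. Clipping, by contrast, leaves the weights alone and only bounds the size of each update.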