Part of the appendix of the master's thesis by Jonas Emmert. Inspired by https://youtu.be/87kLfzmYBy8?si=dhAgC-6mKGZenefM.
Visualization scripts are provided for linear activation, tanh activation, and activation with rectified linear units (ReLU).
The initial hidden state can be changed with hs. The gradients are initialized with dh.
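The scripts themselves are not reproduced here, so the following is only a minimal sketch of how hs and dh might enter such a computation. It assumes a scalar recurrent unit h_t = f(w * h_{t-1}) whose gradient is propagated backwards through time. The function name gradient_flow, the weight w, and the parameter steps are hypothetical and do not come from the thesis code.

```python
import math

# Each activation is paired with its derivative w.r.t. the pre-activation z.
ACTIVATIONS = {
    "linear": (lambda z: z, lambda z: 1.0),
    "tanh": (math.tanh, lambda z: 1.0 - math.tanh(z) ** 2),
    "relu": (lambda z: max(0.0, z), lambda z: 1.0 if z > 0 else 0.0),
}


def gradient_flow(activation, hs=1.0, dh=1.0, w=0.9, steps=5):
    """Return the gradient w.r.t. the hidden state at every time step.

    hs    -- initial hidden state (assumed meaning of "hs")
    dh    -- gradient arriving at the last hidden state (assumed meaning of "dh")
    w     -- hypothetical scalar recurrent weight
    steps -- hypothetical number of unrolled time steps
    """
    f, f_prime = ACTIVATIONS[activation]

    # Forward pass: store the pre-activations needed for the backward pass.
    pre_activations = []
    h = hs
    for _ in range(steps):
        z = w * h
        pre_activations.append(z)
        h = f(z)

    # Backward pass: chain rule through each step, starting from dh.
    grads = [dh]
    g = dh
    for z in reversed(pre_activations):
        g = g * f_prime(z) * w
        grads.append(g)
    return grads  # grads[0] is at the last step, grads[-1] at the initial state


if __name__ == "__main__":
    for name in ACTIVATIONS:
        print(name, gradient_flow(name, hs=1.0, dh=1.0, w=0.9, steps=5))
```

Under these assumptions, changing hs moves the pre-activations into a different regime of tanh or ReLU, while dh only scales the gradients at every step.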