Highway networks implemented in PyTorch.
Just the MNIST example from PyTorch hacked to work with Highway layers.
- Make the Highway nn.Module reusable and configurable.
- Why does softmax work better than sigmoid? This shouldn't be the case...
- Make training graphs on the MNIST dataset.
- Add convolutional highway networks.
- Add recurrent highway networks.
- Experiment with convolutional highway networks for character embeddings.
- ELU doesn't work better than ReLU for the layer activation.
- Softmax seems to work better than sigmoid for the gate function?!
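
To make the gate and activation comparisons above concrete, here is a minimal sketch of a configurable highway layer. It is not the code in this repository: the class name `Highway`, the `gate` and `gate_bias` parameters, and taking softmax over the feature dimension are all assumptions made for illustration. The layer computes y = T(x) * H(x) + (1 - T(x)) * x, where H is the transform and T is the gate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Highway(nn.Module):
    """Hypothetical highway layer: y = T(x) * H(x) + (1 - T(x)) * x."""

    def __init__(self, size, activation=F.relu, gate="sigmoid", gate_bias=-1.0):
        super().__init__()
        if gate not in ("sigmoid", "softmax"):
            raise ValueError(f"unknown gate function: {gate}")
        self.transform = nn.Linear(size, size)  # produces H(x) before activation
        self.gate = nn.Linear(size, size)       # produces T(x) before squashing
        self.activation = activation            # e.g. F.relu or F.elu
        self.gate_kind = gate
        # A negative gate bias makes a sigmoid gate start out favoring the
        # carry path. Softmax is shift-invariant, so a constant bias has no
        # effect on a softmax gate.
        nn.init.constant_(self.gate.bias, gate_bias)

    def forward(self, x):
        h = self.activation(self.transform(x))
        g = self.gate(x)
        if self.gate_kind == "sigmoid":
            t = torch.sigmoid(g)      # each unit gated independently in (0, 1)
        else:
            t = F.softmax(g, dim=-1)  # gate values sum to 1 across features
        return t * h + (1.0 - t) * x


# Usage: input and output have the same shape, so layers can be stacked.
x = torch.randn(4, 8)
layer = Highway(8, activation=F.elu, gate="softmax")
print(layer(x).shape)  # torch.Size([4, 8])
```

One difference worth keeping in mind when comparing the two gates: under the assumption above, a softmax gate spreads a total weight of 1 across all features, so with many units most of each input is carried through unchanged. Whether that accounts for the softmax result noted above has not been checked here.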