aromanro/MachineLearning

Add the possibility to 'skip' layers

aromanro opened this issue · 1 comment

This would alleviate the vanishing gradient problem for deeper networks, and it's fairly easy to implement: either in the 'residual' style of ResNet (https://arxiv.org/abs/1512.03385), where a previous layer's output is added to the current layer's output, or by simply concatenating an output from a previous layer with the output of the current layer, the combined result being fed into the next layer's input.
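A minimal C++ sketch of the two combination schemes, assuming Eigen vectors and hypothetical function names (this is not the repository's actual layer API):

```cpp
#include <Eigen/Dense>

// ResNet-style residual connection: element-wise addition.
// Requires the two outputs to have the same size.
Eigen::VectorXd ResidualSkip(const Eigen::VectorXd& earlierOutput, const Eigen::VectorXd& currentOutput)
{
    return earlierOutput + currentOutput;
}

// Concatenation-style skip: stack the two outputs; the next layer
// must then accept an input of the combined size.
Eigen::VectorXd ConcatSkip(const Eigen::VectorXd& earlierOutput, const Eigen::VectorXd& currentOutput)
{
    Eigen::VectorXd combined(earlierOutput.size() + currentOutput.size());
    combined << earlierOutput, currentOutput;
    return combined;
}
```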

This can be taken quite far, as for example in DenseNet: https://arxiv.org/abs/1608.06993

I probably won't do it.
For the current code it would be rather the DenseNet approach (https://arxiv.org/abs/1608.06993), and that increases the number of parameters fast (a rough illustration of the growth is sketched below).
Unless I also implement convolutional networks, I don't think it's worth it.
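As a rough illustration of the parameter growth (assumptions: fully connected layers of equal width, biases ignored, example numbers only): with DenseNet-style concatenation, layer i receives the concatenation of all earlier outputs, so its weight matrix is roughly i times larger, and the total weight count grows quadratically with depth.

```cpp
#include <cstddef>
#include <iostream>

int main()
{
    const std::size_t width = 100;   // outputs per layer (assumed)
    const std::size_t layers = 10;   // network depth (assumed)

    std::size_t plain = 0;
    std::size_t dense = 0;

    for (std::size_t i = 1; i <= layers; ++i)
    {
        plain += width * width;       // input is just the previous layer's output
        dense += i * width * width;   // input is the concatenation of all previous outputs
    }

    std::cout << "Plain stack weights: " << plain << '\n';   // 10 * 100 * 100 = 100000
    std::cout << "Dense-style weights: " << dense << '\n';   // 55 * 100 * 100 = 550000
}
```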