PyTorch implementation of sparsemax, from http://proceedings.mlr.press/v48/martins16.pdf (International Conference on Machine Learning 2016).
Sparsemax is an activation function similar to softmax, but it can produce sparse probability distributions over its inputs. This makes it interesting for attention models.
I tested replacing the softmax activation in the last layer with sparsemax, and it gives similar results.
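For reference, the forward pass of sparsemax from the paper can be sketched in plain NumPy (this is only an illustrative sketch of the algorithm, not the repository's PyTorch code): sort the input, find the support size, compute the threshold tau, and clip.

```python
import numpy as np

def sparsemax(z):
    """Sparsemax (Martins & Astudillo, 2016): Euclidean projection of z
    onto the probability simplex, which yields sparse probabilities."""
    z = np.asarray(z, dtype=float)
    # Sort in decreasing order.
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    # Support: indices where 1 + k * z_(k) > cumulative sum of z_(1..k).
    support = 1 + k * z_sorted > cumsum
    k_z = k[support][-1]
    # Threshold tau such that the clipped values sum to 1.
    tau = (cumsum[support][-1] - 1) / k_z
    return np.maximum(z - tau, 0.0)

p = sparsemax([1.0, 0.5, -1.0])
# p sums to 1 and the smallest input gets exactly zero probability.
```

Unlike softmax, entries far below the threshold receive exactly zero probability rather than a small positive value, which is the property that makes the output sparse.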
Coded by Max Raphael Sobroza Marques