Deep network a la Covington paper
aimran opened this issue · 3 comments
Recently came across this project. Massively impressed by the clean and intuitive API!
I started with the basic `Pooling` method since it seemed to be the simplest. Then I noticed the Covington reference -- the paper seems to imply a deeper network, whereas `PoolNet` uses a single layer. Am I completely misreading it? (Feel free to tell me to shut up :-D)
Wondering if you experimented with adding additional layers? I'd be happy to dig into it otherwise.
Best
Asif
You can definitely try stacking more layers on top. You can do this by writing your own version of the pooling layer, then passing it as the `representation` argument of the model constructor.
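To make the idea concrete, here is a minimal numpy sketch of what such a deeper pooling representation could look like, in the spirit of Covington et al.: pool the item embeddings over the sequence, then push the result through a couple of extra dense layers. All names, shapes, and the `deep_pooling_representation` function are illustrative assumptions, not the library's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)

num_items, embedding_dim, hidden_dim = 1000, 32, 64

# Hypothetical item embedding table (would normally be learned).
item_embeddings = rng.normal(scale=0.1, size=(num_items, embedding_dim))

# Two extra dense layers stacked on top of the pooled embedding.
w1 = rng.normal(scale=0.1, size=(embedding_dim, hidden_dim))
w2 = rng.normal(scale=0.1, size=(hidden_dim, embedding_dim))


def deep_pooling_representation(item_ids):
    """Average the sequence's embeddings, then apply a small ReLU MLP."""
    pooled = item_embeddings[item_ids].mean(axis=0)  # (embedding_dim,)
    hidden = np.maximum(w1.T @ pooled, 0.0)          # (hidden_dim,)
    return np.maximum(w2.T @ hidden, 0.0)            # (embedding_dim,)


user_vector = deep_pooling_representation([3, 17, 256])
print(user_vector.shape)  # (32,)
```

In a real implementation the extra layers would be trainable modules inside the representation object you pass to the model constructor; the point is only that the pooling step and the depth of the network are independent choices.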
Assuming you are using the sequential models, I would strongly recommend building on top of the LSTM-based representation: it gets much better results.
Thanks for the tip. And I am using the seq models, as you'd guessed. I will start with the LSTM.
I guess what tripped me up with `Pooling` was that the `target` and `input` were sharing the same `item_embeddings` -- this may(?) not necessarily be true if one were to add layers. Covington et al. seem to suggest that the penultimate layer becomes a pseudo-embedding of sorts.
As you say, the default implementations share the embeddings for the input and output layers. You can change that with your own representation layer. Whether or not the input and output embeddings are shared is unrelated to the number of layers; you can have a deep model without tying the embeddings. It may be worth trying both and seeing which one works better!
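A small sketch may help separate the two ideas. Below, the sequence is encoded with one embedding table while scoring uses a second, untied table, so the pooled user vector plays the role of the "pseudo-embedding" mentioned above. The table names and the `score` function are hypothetical, not the library's API; a tied model would simply reuse one table for both roles.

```python
import numpy as np

rng = np.random.default_rng(1)
num_items, dim = 500, 16

# Untied tables: one for encoding the input sequence, one for scoring targets.
input_embeddings = rng.normal(scale=0.1, size=(num_items, dim))
output_embeddings = rng.normal(scale=0.1, size=(num_items, dim))


def score(sequence, candidate_items):
    """Score candidates against a user vector pooled from the input table."""
    user_repr = input_embeddings[sequence].mean(axis=0)  # (dim,)
    return output_embeddings[candidate_items] @ user_repr


scores = score([5, 42, 99], [0, 1, 2])
print(scores.shape)  # (3,)
```

Tying would replace `output_embeddings` with `input_embeddings` in `score`; everything else, including any extra layers between pooling and scoring, stays the same.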