nouhadziri/THRED

Missing message-level attention?


Hello,

Thanks for releasing this codebase. I was reading your paper about the THRED model (https://arxiv.org/pdf/1811.01063.pdf) and noticed that in the generation process you compute two different attention mechanisms: a message-level attention to build a representation of each utterance, and a context-level attention to produce the context vector of the classical HRED model. It looks to me like the message-level attention is missing from the actual implementation: https://github.com/nouhadziri/THRED/blob/master/models/thred/thred_model.py#L212
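For clarity, here is a minimal NumPy sketch of what I mean by message-level attention; the names, shapes, and parameterization are illustrative and not taken from your code:

```python
import numpy as np

def message_level_attention(token_states, query, W, v):
    """Bahdanau-style attention over the token states of a single utterance.

    token_states: (T, d) per-token encoder hidden states for one message
    query:        (d,)   e.g. the previous context-RNN state
    W, v:         learned parameters of shapes (2*d, d) and (d,)
    Returns a single (d,) attended utterance representation.
    """
    T, _ = token_states.shape
    # Score each token against the query.
    concat = np.concatenate([token_states, np.tile(query, (T, 1))], axis=1)  # (T, 2d)
    scores = np.tanh(concat @ W) @ v                                          # (T,)
    # Softmax over tokens.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Weighted sum of token states -> utterance vector fed to the context RNN.
    return weights @ token_states                                             # (d,)
```

In the paper, my understanding is that this attended vector (rather than just the final encoder state) would feed the context-level RNN, which then gets its own attention when computing the context vector.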

Is there any reason for this? Did you notice better performance with just the context-level attention?

Thanks a lot for your answer!

Alessandro

Thanks for raising the issue. We refactored the original code before releasing it; the message attention code was a bit messy and we didn't have a chance to adapt it to the released structure.
Our observation was that message attention yields little improvement while making the model larger and, consequently, training slower.

Hope this helps, thanks.