alexa/dialoglue

About Observers in the paper


Hi Mehri,
Thanks for your awesome work 'Example-Driven Intent Prediction with Observers', and for open-sourcing the codebase.
How did you add observers to the BERT model in your codebase? I can't find anything related to [OBS]. Did you use [PAD] as [OBS]? And how did you make the observers tokens that are not attended to?
Looking forward to your reply.

@mihail-amazon Hi, could you help?

Apologies for the extremely late reply. I'm not an official collaborator on this repo, so I did not get a notification about your issue. To answer your question: yes, [PAD] tokens were used as [OBS]. By default, [PAD] tokens attend to all other tokens, but the attention mask hides them from every other token, so nothing attends to them. This line in the code averages the final hidden states of the last num_observers positions, i.e. the observers:

pooled_output = hidden_states[:, -self.num_observers:].mean(dim=1)
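To make the masking behaviour concrete, here is a minimal toy sketch in plain Python. It is not the code from this repo: the function names `masked_self_attention` and `pool_observers` are hypothetical, and it assumes a single attention layer with identity projections and no position embeddings. It only illustrates that masked [PAD] positions still produce outputs from the real tokens while receiving zero attention weight themselves, and that pooling averages the last `num_observers` positions.

```python
import math


def masked_self_attention(x, key_mask):
    """Toy single-head self-attention with identity projections (assumption).

    x: list of token vectors. key_mask[j] == 0 hides position j as a key,
    so no query can attend to it. Returns (outputs, attention_weights).
    """
    dim = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Scaled dot-product scores; hidden keys get -inf before the softmax.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) if m else -math.inf
            for k, m in zip(x, key_mask)
        ]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        # Each output is the attention-weighted sum of the value vectors.
        outputs.append([sum(w * v[d] for w, v in zip(row, x)) for d in range(dim)])
    return outputs, weights


def pool_observers(hidden_states, num_observers):
    """Average the last num_observers positions of one sequence."""
    observers = hidden_states[-num_observers:]
    dim = len(observers[0])
    return [sum(v[d] for v in observers) / num_observers for d in range(dim)]


# Three real tokens followed by two [PAD] positions acting as observers.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
num_observers = 2
pad = [0.0, 0.0]  # stand-in embedding for [PAD] (assumption)
x = tokens + [pad] * num_observers
key_mask = [1] * len(tokens) + [0] * num_observers

hidden, attn = masked_self_attention(x, key_mask)
pooled = pool_observers(hidden, num_observers)
```

In this sketch every row of `attn` has zero weight in the two observer columns, while the observer rows themselves spread their weight over the three real tokens. A real BERT stacks many such layers and adds position embeddings, so actual observer outputs differ from each other.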

Got it! I have also noticed it. Thank you.