H-Transformer for Cross-Attention?
Vbansal21 opened this issue · 4 comments
Vbansal21 commented
Is it possible to use this architecture for cross-attention?
lucidrains commented
It's not, unfortunately :(
Vbansal21 commented
What if the code were altered to support cross-attention? Would that give any meaningful results?
lucidrains commented
@Vbansal21 it's not possible, since there is no notion of locality between the source and target sequences
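For concreteness, here is a minimal PyTorch sketch (illustrative, not the library's actual code) of the finest level of the hierarchy: each query block attends only to its own block and its immediate neighbors, with everything farther from the diagonal approximated at coarser resolutions. The function `banded_block_attention` and all shapes are hypothetical.

```python
import torch
import torch.nn.functional as F

def banded_block_attention(q, k, v, block_size=4):
    # q, k, v: (batch, seq_len, dim); seq_len assumed divisible by block_size
    b, n, d = q.shape
    nb = n // block_size
    q = q.reshape(b, nb, block_size, d)
    k = k.reshape(b, nb, block_size, d)
    v = v.reshape(b, nb, block_size, d)

    # Pad one zero block on each end of the block axis, then gather each
    # block's left neighbor, itself, and right neighbor. This banding only
    # makes sense because block i of q and block i of k cover the SAME
    # stretch of the SAME sequence -- the locality cross-attention lacks.
    # (A real implementation would also mask the zero padding at the ends.)
    pad_k = F.pad(k, (0, 0, 0, 0, 1, 1))
    pad_v = F.pad(v, (0, 0, 0, 0, 1, 1))
    k_band = torch.cat([pad_k[:, i:i + nb] for i in range(3)], dim=2)  # (b, nb, 3*bs, d)
    v_band = torch.cat([pad_v[:, i:i + nb] for i in range(3)], dim=2)

    # Standard scaled dot-product attention, restricted to the band
    sim = torch.einsum('b n i d, b n j d -> b n i j', q, k_band) * d ** -0.5
    attn = sim.softmax(dim=-1)
    out = torch.einsum('b n i j, b n j d -> b n i d', attn, v_band)
    return out.reshape(b, n, d)

x = torch.randn(1, 16, 8)
out = banded_block_attention(x, x, x)  # self-attention: q and k share one position axis
```

In self-attention, "near the diagonal" means "nearby tokens", so distant pairs can be handled cheaply at coarser levels. In cross-attention the queries index the target sequence and the keys index the source, so there is no shared axis along which distance from the diagonal is defined, and the hierarchical approximation has nothing to exploit.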
Vbansal21 commented
Okay. Closing this issue, then.