A question
Closed this issue · 3 comments
Dumeowmeow commented
Thank you for your great work! But I am a little confused about Formula 5 in the paper. Why add M_user to M_seg? I think M_user is larger than M_seg, so what is the point of this addition operation? Why not just use M_user?
Shilin-LU commented
Thank you. It is the XOR operation rather than addition.
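To make the difference concrete, here is a minimal sketch of XOR versus addition on two binary masks (the tensors are made-up toy values, not anything from the paper; only the `torch.logical_xor` semantics are the point):

```python
import torch

# Toy binary masks: M_user is the larger user-drawn box, M_seg the tighter
# segmentation mask contained inside it (illustrative values only).
M_user = torch.tensor([[1, 1, 1, 1],
                       [1, 1, 1, 1],
                       [1, 1, 1, 1]], dtype=torch.bool)
M_seg  = torch.tensor([[0, 1, 1, 0],
                       [0, 1, 1, 0],
                       [0, 0, 0, 0]], dtype=torch.bool)

# XOR keeps only the region covered by exactly one of the two masks,
# i.e. the user box minus the segmented object here. Plain addition of
# 0/1 masks would just saturate back to M_user, which is why XOR matters.
M_xor = torch.logical_xor(M_user, M_seg)   # equivalently M_user ^ M_seg
print(M_xor.int())
```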
Dumeowmeow commented
Thank you. And in the code, why is only the cross-attention forward function replaced in register_attention_control, rather than self-attention as well? The paper mentions self-attention.
Shilin-LU commented
No, in our code, both self-attention and cross-attention are composed and injected.
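For readers puzzled by how one forward replacement can cover both, here is a minimal sketch of a register_attention_control-style hook, assuming a diffusers-style UNet where `attn1` modules are self-attention and `attn2` modules are cross-attention; the `controller` callbacks are hypothetical placeholders, not the repository's actual API:

```python
def register_attention_control(unet, controller):
    """Wrap the forward of every attention module so a controller can intervene."""

    def make_forward(attn_module, place_in_unet):
        orig_forward = attn_module.forward

        def forward(hidden_states, encoder_hidden_states=None, **kwargs):
            # encoder_hidden_states is None  -> self-attention (attn1)
            # encoder_hidden_states is given -> cross-attention (attn2)
            is_cross = encoder_hidden_states is not None
            controller.before_attention(is_cross, place_in_unet)  # hypothetical hook
            out = orig_forward(hidden_states,
                               encoder_hidden_states=encoder_hidden_states, **kwargs)
            controller.after_attention(is_cross, place_in_unet)   # hypothetical hook
            return out

        return forward

    for name, module in unet.named_modules():
        # Both attn1 (self) and attn2 (cross) get their forward replaced,
        # so both attention types end up under the controller's control.
        if name.endswith("attn1") or name.endswith("attn2"):
            place = ("down" if "down_blocks" in name
                     else "up" if "up_blocks" in name else "mid")
            module.forward = make_forward(module, place)
```

The key point is that the same wrapper serves both roles: it distinguishes self- from cross-attention at call time via `encoder_hidden_states`, so replacing one class of forward function is enough to inject into both.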