Opened this issue a month ago · 1 comment
Hello, when you calculate the layout loss and the layout-sem loss, which cross-attention layers do the QKV features come from? Do you use the features from all of the cross-attention layers? I'm looking forward to your reply, thanks!
Sorry, I meant the QK features, not QKV.