Question about training details

Question

Opened this issue a month ago · 1 comment

Hello, when you calculate the layout loss and the layout-sem loss, which cross-attention layers do the QKV features come from? Do you use the features from all of the cross-attention layers? I'm looking forward to your reply, thanks!

Answer 1 · 2024-05-30T01:45:09.000Z

QK features, sorry.