The paper mentions visualizing the attention map of the last layer. How is this done?
notfacezhi opened this issue · 4 comments
The first notebook has the code to visualize the images that we used in the paper, including the attention matrix.
Each point in the original image corresponds to a row (or column) of the attention matrix, and that row can be reshaped into an image.
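The reshaping step can be sketched as follows. This is a minimal illustration rather than the notebook's code: the feature-map size `h, w`, the random stand-in matrix `attn`, and the helper `attention_map_for_point` are all assumptions made for the example, and it assumes the `(h, w)` grid was flattened in row-major order.

```python
import torch

# Hypothetical feature-map size; real values depend on the input image and backbone.
h, w = 4, 6

# Stand-in for a self-attention matrix of shape (h*w, h*w); each row sums to 1.
attn = torch.softmax(torch.randn(h * w, h * w), dim=-1)


def attention_map_for_point(attn, row, col, h, w):
    """Return the (h, w) attention map for the feature-map location (row, col).

    Assumes the (h, w) grid was flattened in row-major order.
    """
    idx = row * w + col  # index of the query point in the flattened sequence
    return attn[idx].reshape(h, w)


attn_map = attention_map_for_point(attn, row=2, col=3, h=h, w=w)
print(attn_map.shape)
```

Viewing the whole matrix as `attn.reshape(h, w, h, w)` and indexing `[row, col]` gives the same map, which can be convenient when plotting several points.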
I believe I've answered your question, so I'm closing this issue.
Hey @fmassa, thanks for the great DETR work! I've been trying to replicate some of the illustrations from the paper.
I'd expect the self-attention weights to come from the operation `attn = (q * scale) @ k.T`, which weights the values. However, looking at the Transformer class definitions in the detr repo, the forward pass only returns the final tensor of shape `(b, h * w, c)`.
I don't understand how the hooks in the Colab notebook get their output. Did the Colab model use any other code?
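For readers with the same question, here is a hedged sketch of how a forward hook can recover attention weights even when a layer's `forward` returns only the final tensor. It is not the DETR code: `TinyEncoderLayer` and the toy sizes are hypothetical, and it assumes the layer wraps `torch.nn.MultiheadAttention`, whose `forward` returns the pair `(attn_output, attn_weights)` by default.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy sizes, chosen only for illustration.
h, w, d_model, nheads = 4, 6, 32, 4


class TinyEncoderLayer(nn.Module):
    """Hypothetical stand-in for an encoder layer that returns only the attended tensor."""

    def __init__(self, d_model, nheads):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nheads)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, src):
        # The attention weights (second return value) are discarded here,
        # so callers of the layer never see them.
        out = self.self_attn(src, src, value=src)[0]
        return self.norm(src + out)


layer = TinyEncoderLayer(d_model, nheads)

# A forward hook on the inner attention module sees its full output tuple,
# including the weights that the enclosing layer throws away.
captured = []
hook = layer.self_attn.register_forward_hook(
    lambda module, inputs, output: captured.append(output[1])
)

src = torch.randn(h * w, 1, d_model)  # (sequence, batch, channels)
with torch.no_grad():
    out = layer(src)
hook.remove()  # detach the hook once the weights are captured

attn = captured[0]  # (batch, h*w, h*w), averaged over heads
print(out.shape, attn.shape)
```

Under these assumptions no extra model code is needed: the hook reads the weights directly from the attention submodule during the normal forward pass.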
Did you figure this out?