mlpc-ucsd/CoaT

Viz of attention maps

Opened this issue · 3 comments

@yix081 @xwjabc Thanks for your work, it has helped me a lot, but I have a few queries:

  1. Can we visualize the attention maps (e.g., with Grad-CAM / CAM) to see what the model is learning / has learned? Do you have a codebase for it, or can you suggest how to do it?
  2. CoaT-Lite has only serial blocks while CoaT has serial + parallel blocks, but the #params of CoaT-Lite is higher than CoaT's. Is there any specific reason for this?
  3. How can we reduce the #params in CoaT-Lite/CoaT to <3M? A drop in accuracy is acceptable.
    Thanks in advance

Hi @abhigoku10, thank you for your interest in our work!

  1. It is okay to visualize CoaT using CAM / Grad-CAM. However, if your aim is to visualize the attention map itself, it is a bit more difficult: there is no explicit attention map in our attention mechanism, since we compute the product of K and V first, so you cannot extract an attention map directly. You can, however, mimic standard self-attention and manually compute the product of Q and K to generate the attention map (see the first sketch after this list).

  2. This is because CoaT and CoaT-Lite use different channel settings. We try to align the parameter counts of CoaT and CoaT-Lite for a roughly head-to-head comparison, but some gap remains. You may notice that for the Tiny and Mini models, CoaT has slightly fewer parameters, while for the Small models, CoaT-Lite has fewer.

  3. I would suggest reducing the channels in CoaT-Lite Tiny first. You can try a series of ratios t (e.g., t = 0.3, 0.5, 0.7, 0.9): multiply all channels by t and train each resulting model (perhaps on a subset of ImageNet if computational resources are limited). Then plot the validation accuracy of these models and analyze the accuracy drop w.r.t. the parameter reduction (see the second sketch after this list). You may also try other ways to reduce the parameters (e.g., removing blocks, or reducing channels only in certain blocks) and compare the resulting curves to find the best practice.
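Regarding point 1, a minimal sketch of the Q·K approach: hook the qkv projection of one attention block and compute softmax(QKᵀ/√d) yourself. The module path and attribute names below (`serial_blocks4`, `factoratt_crpe`, `qkv`) are assumptions based on coat.py and may need adjusting to the actual code:

```python
import torch
import torch.nn.functional as F

attn_maps = []
num_heads = 8  # set to the hooked block's head count (CoaT-Lite uses 8)

def qkv_hook(module, inputs, output):
    # output: (B, N, 3*C) concatenated Q, K, V from the linear projection.
    # Note: in CoaT, N includes the class token as the first token.
    B, N, _ = output.shape
    q, k, _ = output.reshape(B, N, 3, num_heads, -1).permute(2, 0, 3, 1, 4)
    scale = q.shape[-1] ** -0.5
    attn = F.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1)  # (B, h, N, N)
    attn_maps.append(attn.detach().cpu())

# `model` is a CoaT / CoaT-Lite instance built from this repo; the attribute
# path below is a guess at the names in coat.py -- adapt as needed.
handle = model.serial_blocks4[-1].factoratt_crpe.qkv.register_forward_hook(qkv_hook)
model.eval()
with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))  # attn_maps now holds one (1, h, N, N) map
handle.remove()
```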
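And for point 3, a rough sketch of the ratio sweep. The `base_dims` values are assumed to be the CoaT-Lite Tiny defaults, so verify them against coat.py; each scaled channel count is rounded to a multiple of num_heads so the multi-head split stays valid:

```python
base_dims = [64, 128, 256, 320]  # CoaT-Lite Tiny embed_dims (verify in coat.py)
num_heads = 8

for t in (0.3, 0.5, 0.7, 0.9):
    # Round each stage's channel count to the nearest multiple of num_heads.
    dims = [max(num_heads, round(d * t / num_heads) * num_heads) for d in base_dims]
    print(f"t={t}: embed_dims={dims}")
    # Build a model with these dims (see the factory sketch in the last reply),
    # train it, and record validation accuracy vs. parameter count.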

@xwjabc thanks for the response.
3. Can you let me know where in the code I have to make the changes? It would be helpful.

You may try to modify the value of embed_dims in https://github.com/mlpc-ucsd/CoaT/blob/main/src/models/coat.py#L609
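For concreteness, a reduced-channel variant could be registered like the hypothetical sketch below, modeled on the coat_lite_tiny factory. Only the embed_dims change is the point; the other constructor arguments are assumptions and should be checked against the linked line:

```python
@register_model
def coat_lite_tiny_half(**kwargs):
    # Hypothetical factory: embed_dims halved from the assumed defaults
    # [64, 128, 256, 320], keeping each entry a multiple of num_heads.
    model = CoaT(
        patch_size=4,
        embed_dims=[32, 64, 128, 160],
        serial_depths=[2, 2, 2, 2],
        parallel_depth=0,
        num_heads=8,
        mlp_ratios=[8, 8, 4, 4],
        **kwargs)
    return model
```

Since most attention and MLP weights scale with the square of the channel width, halving embed_dims roughly quarters those parameter counts.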