xxxnell/how-do-vits-work

Question about Figure 2(a)

iumyx2612 opened this issue · 6 comments

Looking at this figure, I see that the early layers of ResNet have many low-frequency components, and that the deeper ResNet goes, the more high-frequency components it contains. Am I interpreting this figure correctly?

If so, doesn't this contradict the popular belief, and the usual visualizations, that early layers in a ConvNet tend to learn high-frequency components?

Hi @iumyx2612,

I believe a more appropriate interpretation of this figure is that the convolutional layers consistently amplify high-frequency components. Consequently, a significant amount of high-frequency information remains in the representations of deeper layers. My emphasis was on this trend of changes.
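
To make the "trend of changes" concrete, here is a minimal sketch and not the code behind the figure. It assumes the plotted quantity is a relative log amplitude, that is, the log amplitude at the highest frequency minus the log amplitude at zero frequency. The 1-D toy signal, the `sharpen` kernel, and the helper names are all hypothetical stand-ins for real feature maps and learned convolutions.

```python
import cmath
import math


def dft_amplitude(signal, k):
    """Amplitude of the k-th DFT coefficient of a 1-D real signal."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
    return abs(coeff)


def delta_log_amplitude(signal):
    """log|F(pi)| - log|F(0)|: high-frequency amplitude relative to the zero-frequency term."""
    n = len(signal)  # assumed even, so bin n // 2 is the highest (Nyquist) frequency
    eps = 1e-12      # guards against log(0)
    high = dft_amplitude(signal, n // 2)
    low = dft_amplitude(signal, 0)
    return math.log(high + eps) - math.log(low + eps)


def circular_conv(signal, kernel):
    """Circular convolution with a centred, symmetric 3-tap kernel."""
    n = len(signal)
    return [sum(kernel[j] * signal[(t + j - 1) % n] for j in range(3)) for t in range(n)]


# Toy "representation": a large constant offset plus a small alternating pattern,
# so low frequencies dominate in absolute terms.
signal = [10.0 + 0.1 * (-1) ** t for t in range(16)]

# Hypothetical stand-in for a convolution that amplifies high frequencies
# (zero-frequency gain 1, highest-frequency gain 3).
sharpen = [-0.5, 2.0, -0.5]

scores = []
x = signal
for depth in range(4):
    scores.append(delta_log_amplitude(x))
    x = circular_conv(x, sharpen)

for depth, score in enumerate(scores):
    print(f"depth {depth}: delta log amplitude = {score:.3f}")
```

In this toy setting the score stays negative at every depth, so the low-frequency term is still the larger one, yet it rises with each layer. That is the sense in which the trend, rather than the absolute amount of low-frequency content, is the thing to read.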

So, looking at the figure, we can't say that the early layers of a ConvNet contain many low-frequency components, right?

It is not easy to simply compare the amount of low-frequency components from different depths of layers. Instead, I'd like to say that having many low-frequency components in a representation does not necessarily mean the layers are learning low-frequency information in this case.

Oooh, thank you I got it!

I was reading the MogaNet paper and am having difficulty understanding some parts of it. Some of the explanations contradict my understanding, so I would like to discuss them with you (by email or on another platform, since this isn't relevant to this GitHub repository), as some parts of that paper take inspiration from your work. Is that okay? Sorry if this bothers you.

I am happy to do it. Please feel free to send me an email at namuk.park@gmail.com or park.namuk@gene.com.