yingkaisha/keras-vision-transformer

swin transformer depth

atgc1984 opened this issue · 1 comment

Hello,
Thank you for the example on MNIST classification. The demo seems to have only one downsample stage, which is enough for that case. How can I configure a larger model with more downsample stages? Should I just add more patch_extract/patch_embedding layers and block loops with a smaller patch size?

Thank you.
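For reference, here is a minimal sketch of what an extra downsample stage could look like. It assumes the layer names used in this repo's MNIST demo (transformer_layers.patch_extract, transformer_layers.patch_embedding, transformer_layers.patch_merging, and swin_layers.SwinTransformerBlock); the exact keyword arguments are illustrative and should be checked against swin_layers.py and transformer_layers.py. The main idea is that downsampling between stages is typically done with patch_merging, which halves the patch grid and doubles the embedding dimension, rather than by inserting another patch_extract/patch_embedding pair.

```python
# Hedged sketch of a two-stage Swin backbone for 28x28 MNIST inputs.
# Layer names follow this repo's MNIST demo; keyword arguments are
# assumptions and should be verified against swin_layers.py and
# transformer_layers.py.
from tensorflow import keras
from keras_vision_transformer import swin_layers, transformer_layers

patch_size = (2, 2)                      # non-overlapping patch size
embed_dim = 64                           # embedding dimension of stage 1
num_patch_x = 28 // patch_size[0]        # 14
num_patch_y = 28 // patch_size[1]        # 14

IN = keras.layers.Input(shape=(28, 28, 1))

# Stage 1: patch extraction + linear embedding + Swin block(s)
X = transformer_layers.patch_extract(patch_size)(IN)
X = transformer_layers.patch_embedding(num_patch_x * num_patch_y, embed_dim)(X)
X = swin_layers.SwinTransformerBlock(dim=embed_dim,
                                     num_patch=(num_patch_x, num_patch_y),
                                     num_heads=4, window_size=2,
                                     shift_size=0, num_mlp=256)(X)

# Downsample: patch_merging halves the patch grid and doubles the channels.
# Each extra stage repeats this step followed by a new group of Swin blocks.
X = transformer_layers.patch_merging((num_patch_x, num_patch_y), embed_dim)(X)
num_patch_x, num_patch_y = num_patch_x // 2, num_patch_y // 2   # 7 x 7
embed_dim = embed_dim * 2                                        # 128

# Stage 2: Swin block(s) on the coarser patch grid.
# The patch grid should stay divisible by window_size; here the whole
# 7x7 grid is treated as a single window.
X = swin_layers.SwinTransformerBlock(dim=embed_dim,
                                     num_patch=(num_patch_x, num_patch_y),
                                     num_heads=8, window_size=7,
                                     shift_size=0, num_mlp=512)(X)

# Classification head
X = keras.layers.GlobalAveragePooling1D()(X)
OUT = keras.layers.Dense(10, activation='softmax')(X)
model = keras.models.Model(inputs=IN, outputs=OUT)
```

Each additional stage would repeat the same pattern: a group of SwinTransformerBlock layers followed by one patch_merging, with num_patch halved and embed_dim doubled, as long as the patch grid remains divisible by the chosen window size.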


Hi, have you solved the problem you described? I have the same question. Could you give me some advice?