- Learning resources:
Model | Original paper | Source code | Test dataset and results (this project) | Reference videos | Reference articles |
---|---|---|---|---|---|
Vision Transformer | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (arxiv.org, readpaper.com) | google, WZMIAOMIAO, KKKSQJ | | Paper walkthrough, architecture walkthrough, code walkthrough | Multi-Head Attention explained, ViT architecture explained, zhihu.com |
Swin Transformer | Swin Transformer: Hierarchical Vision Transformer using Shifted Windows (arxiv.org, readpaper.com) | microsoft, WZMIAOMIAO, KKKSQJ | | Paper walkthrough, architecture walkthrough, code walkthrough | Architecture explained, zhihu.com, CSDN blog |
CoAtNet | CoAtNet: Marrying Convolution and Attention for All Data Sizes (arxiv.org, readpaper.com) | KKKSQJ | | Paper close reading, code walkthrough (part 1), code walkthrough (part 2) | zhihu.com |