SwinUnetArchitecturePytorch

Swin-UNet is a variant of the widely used U-Net architecture that combines the windowed attention mechanism of the Swin Transformer with the U-Net framework.


Swin-UNet architecture implementation in PyTorch.



The Swin Transformer builds on the Vision Transformer by computing self-attention only within local windows and by shifting those windows between layers. The shift connects neighboring windows, which significantly enhances the modeling power of the architecture.
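To make the windowing idea concrete, here is a minimal sketch in PyTorch. The function name `window_partition`, the `(B, H, W, C)` layout, and the toy 8x8 input are assumptions chosen for illustration and are not taken from this repository's notebook. It splits a feature map into non-overlapping windows, then applies a cyclic shift of half a window with `torch.roll` before partitioning again.

```python
import torch


def window_partition(x: torch.Tensor, window_size: int) -> torch.Tensor:
    """Split a (B, H, W, C) feature map into non-overlapping windows.

    Returns a tensor of shape (B * num_windows, window_size, window_size, C).
    H and W are assumed to be divisible by window_size.
    """
    B, H, W, C = x.shape
    x = x.reshape(B, H // window_size, window_size, W // window_size, window_size, C)
    # Bring the two window-grid axes together, then flatten them into the batch axis.
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size, window_size, C)


# Toy input: an 8x8 map with one channel whose values are row-major position indices.
H = W = 8
window_size = 4
shift = window_size // 2
x = torch.arange(H * W).reshape(1, H, W, 1)

# Regular partition: attention would be computed inside each 4x4 window only.
regular = window_partition(x, window_size)

# Shifted partition: roll the map by half a window, then partition again.
shifted = window_partition(torch.roll(x, shifts=(-shift, -shift), dims=(1, 2)), window_size)


def source_windows(window: torch.Tensor) -> set:
    """Return the ids of the regular windows that the positions in `window` came from."""
    idx = window.flatten()
    rows, cols = idx // W, idx % W
    ids = (rows // window_size) * (W // window_size) + cols // window_size
    return set(ids.tolist())


regular_sources = source_windows(regular[0])
shifted_sources = source_windows(shifted[0])
```

In this toy case the first regular window contains positions from one window only, while the first shifted window contains positions from all four regular windows, which is how information crosses window borders. The sketch leaves out the attention computation itself and the masking that is needed for windows that wrap around the image border after the cyclic shift.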

Restricting attention to a local window gives the Swin Transformer a computational complexity that is linear in the input image size, compared with the quadratic complexity of the Vision Transformer.
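The difference in scaling can be checked with a few lines of arithmetic. The sketch below uses the cost estimates given in the Swin Transformer paper for a feature map of h x w tokens with C channels and window size M; the function names and the example values (C = 96, M = 7, 56x56 and 112x112 token grids) are assumptions picked for illustration.

```python
def msa_flops(h: int, w: int, C: int) -> int:
    """Approximate cost of global multi-head self-attention over h*w tokens."""
    return 4 * h * w * C**2 + 2 * (h * w) ** 2 * C


def wmsa_flops(h: int, w: int, C: int, M: int) -> int:
    """Approximate cost of window attention with M x M windows over h*w tokens."""
    return 4 * h * w * C**2 + 2 * M**2 * h * w * C


C, M = 96, 7
small_global = msa_flops(56, 56, C)
large_global = msa_flops(112, 112, C)
small_window = wmsa_flops(56, 56, C, M)
large_window = wmsa_flops(112, 112, C, M)

# Doubling each side gives 4x as many tokens.
global_growth = large_global / small_global   # more than 4x: quadratic term dominates
window_growth = large_window / small_window   # exactly 4x: linear in the token count
```

With a fixed window size, the window-attention cost grows in proportion to the number of tokens, while the global-attention cost contains a term that grows with its square.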

Stacking Swin Transformer blocks produces hierarchical feature maps by merging image patches in deeper layers.

Shifting the window between consecutive transformer blocks creates cross-window connections, which enables the model to learn the finer details required in dense prediction tasks such as object detection and semantic segmentation.

[Figure: shifted windows approach]
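The patch merging step that produces the hierarchical feature maps can be sketched as follows. This follows the scheme described in the Swin Transformer paper (concatenate each 2x2 group of neighboring patches, then project 4C channels down to 2C); the class name `PatchMerging`, the `(B, H, W, C)` layout, and the example shapes are assumptions for illustration and may differ from this repository's notebook.

```python
import torch
import torch.nn as nn


class PatchMerging(nn.Module):
    """Halve the spatial resolution and double the channel count.

    Input:  (B, H, W, C) with even H and W.
    Output: (B, H/2, W/2, 2C).
    """

    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(4 * dim)
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pick the four members of every 2x2 neighborhood.
        x0 = x[:, 0::2, 0::2, :]  # top-left
        x1 = x[:, 1::2, 0::2, :]  # bottom-left
        x2 = x[:, 0::2, 1::2, :]  # top-right
        x3 = x[:, 1::2, 1::2, :]  # bottom-right
        x = torch.cat([x0, x1, x2, x3], dim=-1)  # (B, H/2, W/2, 4C)
        return self.reduction(self.norm(x))      # (B, H/2, W/2, 2C)


# Hypothetical stage input: a 56x56 grid of patch tokens with 96 channels.
x = torch.randn(2, 56, 56, 96)
out = PatchMerging(96)(x)
out_shape = tuple(out.shape)
```

Each application halves the height and width of the token grid, so deeper stages see coarser feature maps with more channels, much like the downsampling path of a convolutional U-Net.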
