Subhaditya's KB

Swin Transformer

Sep 18, 2024

Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows
- Vision Transformer
- general-purpose backbone for computer vision
- hierarchical feature representation
- linear computational complexity with respect to input image size
- shifted-window-based self-attention
- addresses the challenges of adapting the Transformer from language to vision (large variation in the scale of visual entities, and high pixel resolution compared with words in text)
- limits self-attention computation to non-overlapping local windows while still allowing cross-window connections
- flexible enough to model at various scales
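- evaluated on ImageNet (image classification)
- evaluated on COCO (object detection)
- evaluated on ADE20K (semantic segmentation)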
- The hierarchical design and the shifted window approach also prove beneficial for all-MLP architectures.
- Blocks per stage follow a ratio of 1:1:3:1 (Swin-T uses 2, 2, 6, 2 blocks across its four stages)
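
The window ideas in the notes above can be sketched in a few lines of NumPy. This is a minimal illustration, not the official implementation: it assumes an `(H, W, C)` feature map whose height and width are divisible by the window size, and the function names (`window_partition`, `shifted_window_partition`, `global_attention_cost`, `window_attention_cost`) are hypothetical. The cost functions follow the paper's complexity estimates, 4hwC² + 2(hw)²C for global self-attention and 4hwC² + 2M²hwC for window attention with window size M. The attention mask that the real model applies after the cyclic shift is left out.

```python
import numpy as np


def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size * window_size, C).
    Self-attention would then be computed inside each window independently.
    """
    H, W, C = x.shape
    assert H % window_size == 0 and W % window_size == 0
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    x = x.transpose(0, 2, 1, 3, 4)  # group the two window-grid axes together
    return x.reshape(-1, window_size * window_size, C)


def shifted_window_partition(x, window_size):
    """Cyclically shift the map by half a window, then partition it.

    Each new window straddles the borders of the previous partition, which is
    how information crosses between windows in alternating layers.
    Assumption: the mask for wrapped-around tokens is omitted in this sketch.
    """
    shift = window_size // 2
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)


def global_attention_cost(h, w, c):
    """Approximate cost of global self-attention: quadratic in h * w."""
    return 4 * h * w * c**2 + 2 * (h * w) ** 2 * c


def window_attention_cost(h, w, c, m):
    """Approximate cost of window attention with window size m: linear in h * w."""
    return 4 * h * w * c**2 + 2 * m**2 * h * w * c


if __name__ == "__main__":
    feature_map = np.arange(8 * 8).reshape(8, 8, 1)
    print(window_partition(feature_map, 4).shape)          # (4, 16, 1)
    print(shifted_window_partition(feature_map, 4).shape)  # (4, 16, 1)

    # Doubling the resolution quadruples the token count.
    for side in (56, 112):
        print(side, global_attention_cost(side, side, 96),
              window_attention_cost(side, side, 96, 7))
```

With a fixed window size, quadrupling the number of tokens quadruples the window-attention cost, while the global-attention cost grows faster because of its (hw)² term. That is the sense in which the complexity is linear in image size.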
