Subhaditya's KB

Dilated Sliding Window Attention

Sep 18, 2024 · 1 min read

  • architecture

  • Analogous to a dilated CNN: the sliding window has gaps of size (dilation) d, so each position attends to every d-th token within its window rather than a contiguous span.
  • Assuming a fixed dilation d and window size w across all l layers, the receptive field is l × d × w, which can reach tens of thousands of tokens even with small values of d.
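The pattern above can be sketched in NumPy as a boolean attention mask; stacking layers then composes the mask, so the reachable span grows roughly as l × d × w. A minimal sketch with a symmetric window (function name and window convention are illustrative assumptions, not the Longformer implementation):

```python
import numpy as np

def dilated_window_mask(seq_len, w, d):
    """Boolean attention mask: query i attends to keys i + k*d
    for k in [-(w//2), w//2], i.e. every d-th token in its window."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        for k in range(-(w // 2), w // 2 + 1):
            j = i + k * d
            if 0 <= j < seq_len:
                mask[i, j] = True
    return mask

# One layer: with w=3, d=2, token 8 attends to tokens 6, 8, 10.
mask = dilated_window_mask(16, 3, 2)

# Stacking layers composes the pattern: boolean matrix products give
# the positions reachable after l layers, so the receptive field span
# grows with each added layer.
reach = mask.copy()
for _ in range(2):  # two more layers, l = 3 total
    reach = (reach.astype(int) @ mask.astype(int)) > 0
```

Note that with d = 2 a single head only ever reaches same-parity positions; this is why multi-head setups vary d per head, so heads cover the gaps left by one another.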

Backlinks

  • Attention
  • Global and Sliding Window Attention
  • Longformer
  • _Index_of_Models
  • architecture
