Global and Sliding Window Attention

Sep 18, 2024 · 1 min read

  • architecture


  • Sliding Window Attention and Dilated Sliding Window Attention are not always enough: some tasks need a few tokens that can see the entire sequence
  • Longformer therefore adds "global attention" on a few pre-selected input locations, e.g. the [CLS] token for classification
  • This attention operation is symmetric: a token with global attention attends to all tokens across the sequence, and all tokens in the sequence attend to it (see the sketch below)
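
A minimal sketch of how such a combined mask could be built, assuming NumPy; `longformer_mask`, its parameters, and the window convention are illustrative assumptions, not the official Longformer implementation:

```python
import numpy as np

def longformer_mask(seq_len: int, window: int, global_positions: list[int]) -> np.ndarray:
    """Boolean mask where mask[i, j] = True means query i may attend to key j.

    Combines a symmetric sliding window of size `window` with global
    attention at `global_positions`. (Illustrative, not the official API.)
    """
    mask = np.zeros((seq_len, seq_len), dtype=bool)

    # Sliding window: each token attends to its neighbours within the window.
    for i in range(seq_len):
        lo = max(0, i - window // 2)
        hi = min(seq_len, i + window // 2 + 1)
        mask[i, lo:hi] = True

    # Global attention is symmetric: a global token attends to everything
    # (its full row), and everything attends to it (its full column).
    for g in global_positions:
        mask[g, :] = True
        mask[:, g] = True

    return mask

# e.g. a [CLS]-style token at position 0 with global attention:
mask = longformer_mask(seq_len=8, window=3, global_positions=[0])
assert mask[0].all() and mask[:, 0].all()  # symmetry of global attention
```

In practice this mask is applied to the attention scores (masked positions are set to -inf before the softmax); Longformer additionally learns separate query/key/value projections for the global attention.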


