Attention
- The model learns where to look in the input, weighting the most relevant positions when producing each output
- Self Attention
- Additive Attention
- Dot Product Attention
- Location Aware Attention
- Relative Multi Head Self Attention
- Soft Attention
- Scaled Dot Product Attention
- Encoder Decoder Attention
- Multi Head Attention
- Strided Attention
- Fixed Factorization Attention
- Sliding Window Attention
- Dilated Sliding Window Attention
- Global and Sliding Window Attention
- Content Based Attention
- Location Based Attention
- Mixed Chunk Attention
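Several of the variants above (Scaled Dot Product, Multi Head, Encoder Decoder) build on the same core operation: scale query-key similarities, softmax them, and take a weighted sum of the values. A minimal NumPy sketch, assuming single-head, unbatched inputs (the function names here are illustrative, not from any library):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V, mask=None):
    # Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)        # (n_q, n_k) similarities
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # block disallowed positions
    weights = softmax(scores, axis=-1)     # each row sums to 1
    return weights @ V, weights

Q = np.array([[1.0, 0.0]])                 # one query
K = np.array([[1.0, 0.0], [0.0, 1.0]])     # two keys
V = np.array([[1.0], [2.0]])               # two values
out, w = scaled_dot_product_attention(Q, K, V)
```

Multi Head Attention runs this same operation several times in parallel on learned projections of Q, K, V and concatenates the results; Encoder Decoder Attention is the same operation with Q from the decoder and K, V from the encoder.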
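The sparse patterns in the list (Strided, Sliding Window, Dilated Sliding Window, Global and Sliding Window) differ only in the mask they apply to the score matrix: each query is allowed to attend to a restricted set of keys. A sketch of a sliding-window mask, assuming a symmetric window (the helper name is illustrative):

```python
import numpy as np

def sliding_window_mask(n, window):
    # True where query i may attend to key j, i.e. |i - j| <= window
    idx = np.arange(n)
    return np.abs(idx[:, None] - idx[None, :]) <= window

mask = sliding_window_mask(5, 1)  # each token sees itself and one neighbor per side
```

A dilated variant would keep only every k-th position inside the window, and the Global and Sliding Window pattern additionally marks a few rows/columns fully True so designated tokens attend everywhere.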