Attention

Sep 18, 2024 · 1 min read

  • architecture


  • The model can decide where to look in the input (see the sketch after this list)
  • Self Attention
  • Additive Attention
  • Dot Product Attention
  • Location Aware Attention
  • Relative Multi Head Self Attention
  • Soft Attention
  • Scaled Dot Product Attention
  • Encoder Decoder Attention
  • Multi Head Attention
  • Strided Attention
  • Fixed Factorization Attention
  • Sliding Window Attention
  • Dilated Sliding Window Attention
  • Global and Sliding Window Attention
  • Content Based Attention
  • Location Base Attention
  • Mixed chunk attention
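
For reference, the core operation most of these variants build on is Scaled Dot Product Attention, softmax(QKᵀ/√d_k)·V. The NumPy sketch below is a minimal illustration; the shapes and toy inputs are assumptions, not taken from any of the linked notes.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted sum of values

# Toy example (assumed shapes): 3 queries attending over 4 key/value pairs, d_k = 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```

Variants such as Multi Head Attention run this in parallel over several projected subspaces, while the sparse and windowed variants (Strided, Sliding Window, Dilated Sliding Window) restrict which keys each query may attend to.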

Backlinks

  • ALBERT
  • Additive Attention
  • Attention NMT
  • BERT
  • Bahdanau Attention
  • Basic Transformer
  • Big Bird
  • Content Based Attention
  • ConvBERT
  • ConvNeXt
  • CvT
  • DeiT
  • Dot Product Attention
  • Encoder Decoder Attention
  • Faster RCNN
  • Fixed Factorization Attention
  • Flamingo
  • GAU
  • Global and Sliding Window Attention
  • Google NMT
  • Interpreting Attention
  • Listen Attend Spell
  • Location Aware Attention
  • Location Base Attention
  • Longformer
  • Mixed chunk attention
  • Multi Head Attention
  • Multiplicative Attention
  • RETRO
  • Self Attention
  • Sliding Window Attention
  • Soft Attention
  • Strided Attention
  • Swin Transformer
  • Transformer
  • _Index_of_Models
  • architecture
  • cross-layer parameter sharing
  • dGSLM
  • i-Code
  • Attention Based Distillation
  • Feature Based Knowledge
  • Attention Alignment
  • Attentions and salience
  • Rescorla-Wagner Algorithm
  • Dopamine
  • Gamma Waves
  • Nootropics
  • Thalamus
  • Gaze position
  • Inattentional Blindness
