Subhaditya's KB

Multiplicative Attention

Oct 14, 2025 · 1 min read

  • architecture

  • $f_{att}(h_i, s_j) = h_i^{\top} W_a s_j$
  • Since Additive Attention performs better as the dimensionality grows, a scaling factor is introduced, giving Scaled Dot Product Attention
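The score above can be sketched in NumPy; the function names, shapes, and the batched formulation (all pairs $(h_i, s_j)$ at once) are assumptions for illustration, not from the note:

```python
import numpy as np

def multiplicative_attention(H, S, W_a):
    """Multiplicative (bilinear) attention scores.

    H:   (n, d_h) matrix of encoder states h_i
    S:   (m, d_s) matrix of decoder states s_j
    W_a: (d_h, d_s) learned weight matrix

    Returns an (n, m) matrix of scores f_att(h_i, s_j) = h_i^T W_a s_j.
    """
    return H @ W_a @ S.T

def attention_weights(scores):
    """Softmax over encoder positions (axis 0) for each decoder step."""
    e = np.exp(scores - scores.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)
```

Using a single matrix product per score keeps the operation cheap compared with the extra feed-forward layer of Additive Attention.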
