Dot Product Attention

Sep 18, 2024 · 1 min read

  • architecture


  • Luong et al., 2015
  • $f_{\text{att}}(h_i, s_j) = h_i^\top s_j$
  • Equivalent to Multiplicative Attention with the weight matrix fixed to the identity matrix, i.e. with no trainable parameters
  • Fast and memory-efficient, but without scaling the scores it tends to underperform at larger hidden dimensions, which is what Scaled Dot Product Attention addresses
  • $h_i$ is a hidden state of the encoder and $s_j$ is a hidden state of the decoder
  • A type of Attention Alignment
  • The final attention weights are obtained by applying Softmax over the scores (see the sketch after this list)
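
A minimal NumPy sketch of the above, for a single decoder step; the function name, the array shapes, and the final weighted sum over encoder states are illustrative assumptions rather than part of the original note:

```python
import numpy as np

def dot_product_attention(H, s):
    """Dot product attention for one decoder step.

    H: (T, d) matrix of encoder hidden states h_1..h_T
    s: (d,)  decoder hidden state s_j
    Returns the context vector and the attention weights.
    """
    # Alignment scores f_att(h_i, s_j) = h_i^T s_j, one per encoder step
    scores = H @ s                          # shape (T,)
    # Softmax over the scores gives the attention weights
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of encoder states (assumed usage)
    context = weights @ H                   # shape (d,)
    return context, weights

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))   # 5 encoder steps, hidden size 8
s = rng.normal(size=8)        # one decoder hidden state
context, weights = dot_product_attention(H, s)
print(weights.sum())          # attention weights sum to 1
```

Dividing the scores by $\sqrt{d}$ before the Softmax turns this into Scaled Dot Product Attention.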

