Gated Recurrent Unit (GRU)

  • Simplified variant of Long Short Term Memory (LSTM)
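  • It has a reset gate and an update gate; the update gate plays the role of LSTM's input and forget gates combined, and there is no output gate or separate cell state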
  • Fewer parameters than LSTM, so usually faster to train; accuracy is often comparable, but LSTM can perform better on some tasks
  • Tries to forget what is not important
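
One reason for the training speed difference is parameter count. A rough sketch, assuming the textbook formulation in which every gate or candidate block has one input weight matrix, one recurrent weight matrix, and one bias vector (library implementations may count biases differently). The function names are illustrative only.

```python
def block_param_count(input_size: int, hidden_size: int) -> int:
    """Parameters in one gate/candidate block: W (h x d), U (h x h), b (h)."""
    return hidden_size * input_size + hidden_size * hidden_size + hidden_size


def gru_param_count(input_size: int, hidden_size: int) -> int:
    """GRU layer: 3 blocks (reset gate, update gate, hidden state proposal)."""
    return 3 * block_param_count(input_size, hidden_size)


def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """LSTM layer: 4 blocks (input, forget, output gates, cell candidate)."""
    return 4 * block_param_count(input_size, hidden_size)


if __name__ == "__main__":
    d, h = 128, 256
    print("GRU :", gru_param_count(d, h))
    print("LSTM:", lstm_param_count(d, h))
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size.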

The Math

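  • Two gates, both with sigmoid activation $\sigma$ ($x_t$ is the input, $h_{t-1}$ the last hidden state)
    • Reset: $r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)$
    • Update: $z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)$
  • Hidden state proposal: $\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)$
    • $\odot$ is the elementwise product; the reset gate controls how much of the last hidden state enters the proposal
  • Final hidden state: $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
    • Linear interpolation between the last hidden state and the proposal, weighted by the update gate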
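
The steps above can be sketched as a single forward step in NumPy. This is a minimal illustration, not a reference implementation: the names `gru_step` and `init_params`, the parameter layout, and the initialization scale are assumptions made for the example. It uses the convention in which the update gate `z_t` weights the proposal and `1 - z_t` weights the last hidden state; some references swap those two roles.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-x))


def init_params(input_size, hidden_size, seed=0):
    """Hypothetical helper: small random weights and zero biases per block."""
    rng = np.random.default_rng(seed)

    def block():
        W = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input weights
        U = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # recurrent weights
        b = np.zeros(hidden_size)                                   # bias
        return W, U, b

    return {"reset": block(), "update": block(), "proposal": block()}


def gru_step(x_t, h_prev, params):
    """One GRU time step for a single input vector x_t and hidden state h_prev."""
    W_r, U_r, b_r = params["reset"]
    W_z, U_z, b_z = params["update"]
    W_h, U_h, b_h = params["proposal"]

    # Two sigmoid gates, each with values in (0, 1)
    r_t = sigmoid(W_r @ x_t + U_r @ h_prev + b_r)  # reset gate
    z_t = sigmoid(W_z @ x_t + U_z @ h_prev + b_z)  # update gate

    # Hidden state proposal: the reset gate scales the last hidden state
    h_tilde = np.tanh(W_h @ x_t + U_h @ (r_t * h_prev) + b_h)

    # Final hidden state: elementwise interpolation between h_prev and h_tilde
    return (1.0 - z_t) * h_prev + z_t * h_tilde


# Usage: run a short sequence of 5 input vectors through the cell
params = init_params(input_size=4, hidden_size=3)
h = np.zeros(3)
for x in np.ones((5, 4)):
    h = gru_step(x, h, params)
```

When the update gate is close to 0 the hidden state is copied forward almost unchanged, which is how the unit keeps information over many steps; when it is close to 1 the state is overwritten by the proposal.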