Seq2Seq

  • Basic RNN Architectures
  • Long-term dependency issues
  • Even if the hidden state vector has high dimensionality, it cannot hold all the information of a long input sequence (see the RNN sketch after this list)
  • Sequence to Sequence Learning with Neural Networks (Sutskever et al., 2014)
  • an encoder-decoder architecture that learns to map input sequences to output sequences (minimal model sketch after this list)
  • multilayered Long Short-Term Memory (LSTM)
  • a large, deep LSTM with a limited vocabulary can outperform a standard statistical machine translation (SMT) system with an unlimited vocabulary on a large-scale MT task
  • evaluated on the WMT'14 English-to-French translation task
  • BLEU score as the evaluation metric (scoring example after this list)
  • reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence, which made the optimization problem easier (reversal snippet after this list)
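
A minimal PyTorch sketch of the fixed-size bottleneck (all sizes here are illustrative assumptions, not values from the paper): whether the input has 10 steps or 1,000, the RNN's final hidden state that summarizes it has the same dimensionality.

```python
import torch
import torch.nn as nn

# Hypothetical sizes; the point is only that the summary vector is fixed-size.
rnn = nn.RNN(input_size=128, hidden_size=256, batch_first=True)

short_seq = torch.randn(1, 10, 128)    # a 10-step input sequence
long_seq = torch.randn(1, 1000, 128)   # a 1000-step input sequence

_, h_short = rnn(short_seq)
_, h_long = rnn(long_seq)
print(h_short.shape, h_long.shape)     # both (1, 1, 256): same capacity either way
```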
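A sketch of the encoder-decoder idea in PyTorch, with toy sizes assumed for illustration (the paper itself used 4 LSTM layers with 1000 cells per layer and far larger vocabularies). The encoder's final LSTM state is the only channel through which the decoder sees the source sentence:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder: encode the source, then condition the
    decoder on the encoder's final (hidden, cell) state."""
    def __init__(self, src_vocab, tgt_vocab, emb=256, hidden=512, layers=2):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, layers, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, layers, batch_first=True)
        self.proj = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.src_emb(src_ids))       # summarize source
        out, _ = self.decoder(self.tgt_emb(tgt_ids), state)  # teacher forcing
        return self.proj(out)                                # per-step logits

model = Seq2Seq(src_vocab=10000, tgt_vocab=10000)
src = torch.randint(0, 10000, (8, 20))   # batch of 8 source sentences
tgt = torch.randint(0, 10000, (8, 22))   # shifted target inputs
logits = model(src, tgt)                 # shape: (8, 22, 10000)
```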
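BLEU scores a candidate translation by its n-gram overlap with reference translations. A quick sentence-level illustration using NLTK (the tokens are made-up examples; the paper reports corpus-level BLEU):

```python
from nltk.translate.bleu_score import sentence_bleu

reference = [["the", "cat", "sat", "on", "the", "mat"]]  # list of reference token lists
hypothesis = ["the", "cat", "sat", "on", "a", "mat"]
print(sentence_bleu(reference, hypothesis))  # 1.0 only for a perfect match
```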
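The reversal trick is purely a preprocessing step on the source side; a sketch assuming whitespace-tokenized sentences:

```python
def reverse_source(src_tokens):
    """Reverse the source sentence (targets stay unchanged), so the first
    source words end up adjacent to the first target words they align with."""
    return list(reversed(src_tokens))

print(reverse_source("the cat sat on the mat".split()))
# ['mat', 'the', 'on', 'sat', 'cat', 'the']
```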