deep [[Long Short Term Memory (LSTM)|LSTM]] network with 8 encoder and 8 decoder layers using attention and residual connections
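The residual wiring between stacked layers can be sketched as follows. This is a minimal illustration, not the paper's implementation: `lstm_layer_stub` is a hypothetical stand-in for a real LSTM layer (any map that preserves the feature size), and the point is only the skip connections between layers.

```python
import numpy as np

def lstm_layer_stub(x, seed):
    """Hypothetical stand-in for an LSTM layer: any map that keeps the feature size."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((x.shape[-1], x.shape[-1]))
    return np.tanh(x @ W)

def residual_stack(x, num_layers=8):
    """Stack layers, adding each layer's input back to its output (residual/skip connection)."""
    h = lstm_layer_stub(x, seed=0)            # first layer: no residual input
    for i in range(1, num_layers):
        h = lstm_layer_stub(h, seed=i) + h    # residual connection
    return h

x = np.ones((5, 16))                          # toy input: (time steps, hidden size)
y = residual_stack(x)                         # shape is preserved: (5, 16)
```

The residual additions keep gradients flowing through deep stacks, which is what makes an 8-layer encoder/decoder trainable in practice.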
to improve parallelism and therefore decrease training time, the attention mechanism connects the bottom layer of the decoder to the top layer of the encoder
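A minimal sketch of that connection: one bottom-layer decoder state attends over the encoder's top-layer outputs. Dot-product scoring is used here for brevity as an assumption; the paper actually scores with a small feed-forward (additive) network.

```python
import numpy as np

def attention(decoder_state, encoder_top):
    """Attend from one decoder state over the encoder's top-layer outputs.

    decoder_state: (hidden,)           -- bottom decoder layer's current state
    encoder_top:   (src_len, hidden)   -- top encoder layer's outputs
    """
    scores = encoder_top @ decoder_state          # one score per source position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax over source positions
    context = weights @ encoder_top               # weighted sum of encoder states
    return context, weights

enc = np.random.default_rng(0).standard_normal((7, 16))   # 7 source positions
dec = np.random.default_rng(1).standard_normal(16)
context, weights = attention(dec, enc)
```

Because only the bottom decoder layer waits on attention, the upper decoder layers can be computed with more parallelism during training.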
low-precision arithmetic is employed during inference computations to accelerate the final translation speed (open question: is FP16 also used during training?)
to improve handling of rare words, words are divided into a limited set of common sub-word units ("wordpieces") for both input and output
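Sub-word segmentation can be sketched with greedy longest-match against a fixed vocabulary. This is an assumption-laden toy (the tiny `vocab` below is invented, and the real wordpiece model chooses segmentations by a learned language-model score, not pure greedy matching), but it shows how a rare word decomposes into common units.

```python
def segment(word, vocab):
    """Greedy longest-match segmentation of a word into sub-word units."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):     # try the longest piece first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            return None                        # unsegmentable with this vocabulary
    return pieces

vocab = {"un", "happi", "ness", "happy", "s"}  # hypothetical toy vocabulary
print(segment("unhappiness", vocab))           # → ['un', 'happi', 'ness']
```

A rare word like "unhappiness" never needs its own vocabulary entry; it is covered by a few frequent sub-word units.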
good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models
the beam search technique employs a length-normalization procedure and uses a coverage penalty
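The length-normalization procedure from the paper scores a candidate as log P(Y|X) divided by lp(Y) = (5 + |Y|)^α / (5 + 1)^α, plus a coverage term; without it, beam search favors short translations because every added token lowers the log-probability. A small sketch (the coverage term is passed in as a precomputed bonus here for simplicity):

```python
def length_penalty(length, alpha=0.6):
    """GNMT length normalization: lp(Y) = (5 + |Y|)^alpha / (5 + 1)^alpha."""
    return ((5.0 + length) ** alpha) / (6.0 ** alpha)

def normalized_score(log_prob, length, coverage_bonus=0.0, alpha=0.6):
    """s(Y, X) = log P(Y|X) / lp(Y) + coverage penalty term."""
    return log_prob / length_penalty(length, alpha) + coverage_bonus

# A longer hypothesis with the same per-token log-probability is no longer
# unfairly penalized: compare raw vs. normalized scores.
short_raw, long_raw = -2.0 * 3, -2.0 * 10      # 3 vs 10 tokens, -2.0 each
short_n = normalized_score(short_raw, 3)
long_n = normalized_score(long_raw, 10)
```

With α = 0, lp(Y) = 1 and the score reduces to the raw log-probability; the paper found α around 0.6–0.7 to work well.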