flow-based network capable of generating high-quality speech from mel-spectrograms
combines insights from Glow and WaveNet to provide fast, efficient, high-quality audio synthesis without the need for auto-regression
implemented as a single network trained with a single cost function, maximizing the likelihood of the training data, which makes the training procedure simple and stable
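
The single likelihood objective can be illustrated with a toy example. The sketch below is not the actual architecture: it assumes one affine coupling layer, a hypothetical linear map (`w`, `b`) standing in for the WaveNet-like conditioning network, and random vectors standing in for audio and mel-spectrogram frames. It shows only how an invertible layer gives an exact log-likelihood, whose negative is the one loss being minimized.

```python
import numpy as np

def coupling_forward(x, cond, w, b):
    """Toy affine coupling step: audio -> latent z, plus log|det Jacobian|."""
    xa, xb = np.split(x, 2, axis=-1)
    # Hypothetical stand-in for the conditioning network: a linear map of
    # the untouched half and the conditioning (mel-like) features.
    h = np.concatenate([xa, cond], axis=-1) @ w + b
    log_s, t = np.split(h, 2, axis=-1)
    zb = np.exp(log_s) * xb + t          # scale and shift the second half
    z = np.concatenate([xa, zb], axis=-1)
    log_det = log_s.sum(axis=-1)         # Jacobian is triangular
    return z, log_det

def coupling_inverse(z, cond, w, b):
    """Exact inverse of coupling_forward: latent z -> audio."""
    za, zb = np.split(z, 2, axis=-1)
    h = np.concatenate([za, cond], axis=-1) @ w + b
    log_s, t = np.split(h, 2, axis=-1)
    xb = (zb - t) * np.exp(-log_s)
    return np.concatenate([za, xb], axis=-1)

def nll(z, log_det, sigma=1.0):
    """Negative log-likelihood under a zero-mean spherical Gaussian prior."""
    d = z.shape[-1]
    log_pz = (-0.5 * (z ** 2).sum(axis=-1) / sigma ** 2
              - 0.5 * d * np.log(2 * np.pi * sigma ** 2))
    return -(log_pz + log_det).mean()

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))              # 3 toy "audio" vectors of length 4
cond = rng.normal(size=(3, 5))           # 3 toy conditioning vectors
w = rng.normal(size=(2 + 5, 4)) * 0.1    # maps [xa, cond] -> [log_s, t]
b = np.zeros(4)

z, log_det = coupling_forward(x, cond, w, b)
loss = nll(z, log_det)                   # the single training objective
x_rec = coupling_inverse(z, cond, w, b)  # recovers x up to rounding error
```

Because the layer is invertible, synthesis needs no auto-regression: a latent `z` is drawn from the Gaussian prior and passed through `coupling_inverse` in one shot, with every output sample computed in parallel.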