Speech Resynthesis

  • Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
  • proposes using self-supervised discrete representations for the task of speech resynthesis
  • separately extract low-bitrate representations for speech content, prosodic information, and speaker identity
  • This allows speech to be synthesized in a controllable manner
  • evaluate the F0 reconstruction, speaker identification performance (for both resynthesis and voice conversion), recordings’ intelligibility, and overall quality using subjective human evaluation
  • the same representations can also serve as an ultra-lightweight speech codec
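
To make the "low-bitrate" and codec points concrete, here is a minimal sketch of how the bitrate of a discrete representation can be estimated: each stream costs at most frame rate × log2(codebook size) bits per second. The function name `stream_bitrate` and all codebook sizes and frame rates below are hypothetical values chosen for illustration; they are not taken from the paper.

```python
import math


def stream_bitrate(codebook_size: int, frames_per_second: float) -> float:
    """Upper bound in bits/s for a stream of discrete codes (no entropy coding)."""
    return frames_per_second * math.log2(codebook_size)


# Hypothetical settings, for illustration only (not the paper's configuration).
content_bps = stream_bitrate(codebook_size=128, frames_per_second=50.0)  # content units
f0_bps = stream_bitrate(codebook_size=16, frames_per_second=12.5)        # quantized F0

# Speaker identity is assumed here to be a single per-utterance vector sent once,
# so it is left out of the per-second rate.
total_bps = content_bps + f0_bps
print(f"content: {content_bps} bps, F0: {f0_bps} bps, total: {total_bps} bps")
```

Under these assumed settings, the content stream dominates the total rate, which is why coarser or slower content codes are the main lever for a lighter codec.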