Speech Resynthesis
- Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
- uses self-supervised discrete representations for the task of speech resynthesis
- separately extract low-bitrate representations for speech content, prosodic information, and speaker identity
- This allows speech to be synthesized in a controllable manner
- evaluation covers F0 reconstruction, speaker identification performance (for both resynthesis and voice conversion), intelligibility of the recordings, and overall quality as judged by subjective human evaluation
- the same representations can also serve as an ultra-lightweight speech codec
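
To make "low-bitrate" concrete, here is a minimal sketch of how the bitrate of a stream of discrete codes can be estimated: a stream emitting one code per frame from a codebook of size K costs at most frame_rate × log2(K) bits per second (before any entropy coding). The frame rates and codebook sizes below are illustrative assumptions, not values taken from the paper, and `stream_bitrate` is a hypothetical helper name.

```python
import math


def stream_bitrate(frames_per_second: float, codebook_size: int) -> float:
    """Upper-bound bitrate in bits/s for one stream of discrete codes.

    Assumes one code per frame and no entropy coding, so each code
    costs log2(codebook_size) bits.
    """
    return frames_per_second * math.log2(codebook_size)


# Hypothetical settings (assumptions for illustration only):
# name -> (frames per second, codebook size)
streams = {
    "content": (50.0, 100),
    "f0": (12.5, 32),
}

# The speaker identity is a single per-utterance vector in this sketch,
# so it is not counted as a per-second cost.
total = sum(stream_bitrate(rate, size) for rate, size in streams.values())

for name, (rate, size) in streams.items():
    print(f"{name}: {stream_bitrate(rate, size):.1f} bits/s")
print(f"total: {total:.1f} bits/s")
```

Under these assumed settings the combined stream stays in the hundreds of bits per second, which is the regime the "ultra-lightweight codec" bullet refers to.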