A Spectral Energy Distance for Parallel Speech Synthesis
NIPS 2020.
We propose a new learning method that allows us to train highly parallel models of speech without requiring access to an analytical likelihood function.
Speech synthesis is an important practical generative modeling problem that has seen great progress over the last few years, with likelihood-based autoregressive neural models now outperforming traditional concatenative systems. A downside of such autoregressive models is that they require executing tens of thousands of sequential operations...