Vocaine The Vocoder And Applications In Speech Synthesis

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Abstract
Vocoders have recently received renewed attention as basic components in speech synthesis applications such as voice transformation, voice conversion, and statistical parametric speech synthesis. This paper presents a new vocoder synthesizer, referred to as Vocaine, that features a novel Amplitude Modulated-Frequency Modulated (AM-FM) speech model, a new way to synthesize non-stationary sinusoids using quadratic phase splines, and a very fast cosine generator. Vocaine is evaluated extensively against several state-of-the-art methods in Copy-Synthesis and Text-To-Speech synthesis experiments. It matches or outperforms STRAIGHT in Copy-Synthesis experiments and outperforms our baseline real-time optimized Mixed-Excitation vocoder at the same computational cost. We report that Vocaine considerably improves our statistical TTS synthesizers and that our new statistical parametric synthesizer [1] matched the quality of our mature production Unit-Selection system with uncompressed waveforms.
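The abstract does not give the details of the quadratic phase splines, so the following is only a minimal sketch of the general idea, not Vocaine's actual algorithm. It assumes the frequency of a single sinusoid is interpolated linearly between frame boundaries, which makes the phase a quadratic polynomial within each frame and keeps both phase and frequency continuous across frames. The function name `quadratic_phase_sinusoid` and all of its parameters are hypothetical, and a plain `math.cos` call stands in for the fast cosine generator.

```python
import math


def quadratic_phase_sinusoid(freqs_hz, frame_len, sample_rate, amplitude=1.0):
    """Synthesize one sinusoid with piecewise-quadratic phase (illustrative only).

    freqs_hz lists the frequency at each frame boundary, so it describes
    len(freqs_hz) - 1 frames of frame_len samples each. Within a frame the
    frequency moves linearly from one boundary value to the next, so the
    phase is quadratic in the sample index.
    """
    samples = []
    phase0 = 0.0  # phase in radians at the start of the current frame
    for f_start, f_end in zip(freqs_hz[:-1], freqs_hz[1:]):
        w0 = 2.0 * math.pi * f_start / sample_rate  # rad/sample at frame start
        w1 = 2.0 * math.pi * f_end / sample_rate    # rad/sample at frame end
        slope = (w1 - w0) / frame_len               # frequency change per sample
        for n in range(frame_len):
            # Integrating the linear frequency gives the quadratic phase.
            phase = phase0 + w0 * n + 0.5 * slope * n * n
            samples.append(amplitude * math.cos(phase))
        # Carry the accumulated phase into the next frame for continuity.
        phase0 += w0 * frame_len + 0.5 * slope * frame_len * frame_len
    return samples


# Usage: a tone gliding from 100 Hz to 200 Hz and back, 80 samples per frame.
chirp = quadratic_phase_sinusoid([100.0, 200.0, 100.0], frame_len=80, sample_rate=8000)
```

With constant boundary frequencies the sketch reduces to an ordinary stationary cosine, which is a convenient sanity check.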
Keywords
vocoders, statistical parametric speech synthesis, text-to-speech, non-stationary, AM-FM, fast cosine generators, phase models, overlap-add, sinusoidal speech models