Directly Modeling Voiced And Unvoiced Components In Speech Waveforms By Neural Networks

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2016)

Abstract
This paper proposes a novel acoustic model based on neural networks for statistical parametric speech synthesis. The neural network outputs parameters of a non-zero mean Gaussian process, which defines a probability density function of a speech waveform given linguistic features. The mean and covariance functions of the Gaussian process represent deterministic (voiced) and stochastic (unvoiced) components of a speech waveform, whereas the previous approach considered the unvoiced component only. Experimental results show that the proposed approach can generate speech waveforms approximating natural speech waveforms.
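
The density described in the abstract can be written out for a finite waveform segment. This is only a sketch in notation chosen here, not taken from the paper: x is a segment of T waveform samples, l the linguistic features, and μ(l) and Σ(l) the mean vector and covariance matrix produced by the network. Restricted to T samples, a Gaussian process reduces to a multivariate Gaussian:

```latex
\begin{align}
p(\mathbf{x} \mid \mathbf{l})
  &= \mathcal{N}\bigl(\mathbf{x};\, \boldsymbol{\mu}(\mathbf{l}),\, \boldsymbol{\Sigma}(\mathbf{l})\bigr), \\
\log p(\mathbf{x} \mid \mathbf{l})
  &= -\tfrac{1}{2}\Bigl[\, T \log 2\pi
     + \log\bigl\lvert \boldsymbol{\Sigma}(\mathbf{l}) \bigr\rvert
     + \bigl(\mathbf{x} - \boldsymbol{\mu}(\mathbf{l})\bigr)^{\top}
       \boldsymbol{\Sigma}(\mathbf{l})^{-1}
       \bigl(\mathbf{x} - \boldsymbol{\mu}(\mathbf{l})\bigr) \Bigr].
\end{align}
```

In this reading, μ(l) carries the deterministic (voiced) component and Σ(l) the stochastic (unvoiced) component. Setting μ(l) = 0 gives a zero-mean model in which only the covariance is modeled, which appears to correspond to the earlier approach the abstract mentions.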
Keywords
Statistical parametric speech synthesis, neural network, waveform