Multipulse Sequences For Residual Signal Modeling

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5(2011)

引用 3|浏览23
暂无评分
摘要
In source-filter models of speech production, the residual signal - what remains after passing the speech signal through the inverse filter - contains important information for the generation of naturally sounding re-synthesized speech. Typically, the voiced regions of residual signals are regarded as a mixture of glottal pulse and noise. This paper introduces a novel approach to represent the noise component of voiced regions of residual signals through autoregressive filtering of multipulse sequences. The positions and amplitudes of the non-zero samples of these multipulse signals are optimized through a closed-loop procedure. The method in question is applied to excitation modeling in statistical parametric synthesis. Experimental results indicate that the use of multipulse-based noise component construction eliminates the necessity of run-time ad hoc procedures such as high-pass filtering and time modulation, common on excitation models for statistical parametric synthesizers, with no loss of synthesized speech quality.
更多
查看译文
关键词
speech synthesis, source modeling, noise modeling, speech analysis, speech coding
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要