Reshaping the Transformed LF Model: Generating the Glottal Source from the Waveshape Parameter R-d

Christer Gobl

18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017), Vols 1-6: Situated Interaction (2017)

Abstract

Precise specification of the voice source would facilitate better modelling of expressive nuances in human spoken interaction. This paper focuses on the transformed version of the widely used LF voice source model, and proposes an algorithm which makes it possible to use the waveshape parameter R-d to directly control the LF pulse, for more effective analysis and synthesis of voice modulations. The R-d parameter, capturing much of the natural covariation between glottal parameters, is central to the transformed LF model. It is used to predict the standard R-parameters, which in turn are used to synthesise the LF waveform. However, the LF pulse that results from these predictions may have an R-d value noticeably different from the specified R-d, yielding undesirable artefacts, particularly when the model is used for detailed analysis and synthesis of non-modal voice. A further limitation is that only a subset of possible R-d values can be used, to avoid conflicting LF parameter settings. To eliminate these problems, a new iterative algorithm was developed based on the Newton-Raphson method for two variables, but modified to include constraints. This ensures that the correct R-d is always obtained and that the algorithm converges for effectively all permissible R-d values.
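
The abstract names a Newton-Raphson method for two variables, modified to include constraints, but does not give its details. As a rough illustration of that general technique only, the sketch below runs a two-variable Newton-Raphson iteration and clamps each iterate to a box. The function name `newton2_constrained`, the toy equations, the bounds, and the clamping strategy are all assumptions made for this example; they are not the LF model equations or the constraint handling used in the paper.

```python
def newton2_constrained(f, jac, start, lo, hi, tol=1e-12, max_iter=50):
    """Two-variable Newton-Raphson with each iterate clamped to a box.

    f(x, y)   -> (f1, f2), the two residuals to drive to zero
    jac(x, y) -> ((df1/dx, df1/dy), (df2/dx, df2/dy))
    lo, hi    -> (x_min, y_min) and (x_max, y_max); hypothetical bounds
    """
    x, y = start
    for _ in range(max_iter):
        f1, f2 = f(x, y)
        if max(abs(f1), abs(f2)) < tol:
            return x, y
        (a, b), (c, d) = jac(x, y)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J * (dx, dy) = -(f1, f2) with Cramer's rule.
        dx = (-f1 * d + f2 * b) / det
        dy = (-f2 * a + f1 * c) / det
        # Constraint step (an assumption): project the update onto the box.
        x = min(max(x + dx, lo[0]), hi[0])
        y = min(max(y + dy, lo[1]), hi[1])
    raise RuntimeError("no convergence within max_iter iterations")


# Toy system standing in for the real equations:
#   x^2 + y^2 = 4  and  x * y = 1
def toy_residuals(x, y):
    return x * x + y * y - 4.0, x * y - 1.0


def toy_jacobian(x, y):
    return (2.0 * x, 2.0 * y), (y, x)


root_x, root_y = newton2_constrained(
    toy_residuals, toy_jacobian, start=(2.0, 0.5), lo=(0.1, 0.1), hi=(3.0, 3.0)
)
print(root_x, root_y)
```

Clamping is the simplest way to keep iterates inside a permitted region. A practical implementation might instead shorten the step or reparameterise the variables, and which of these the paper uses cannot be determined from the abstract.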

Keywords

transformed LF model, R-d parameter, glottal, voice source, Newton-Raphson method
