
A dynamic tonal perception model for optimal pitch stylization

Computer Speech & Language(2013)

Citations: 21 | Views: 5
Abstract
Automatic pitch stylization is an important resource for researchers working on both prosody and speech technologies. To be useful, the stylized F0 curve should contain as few control points as possible while remaining perceptually close to the original curve. Here, a pitch stylization algorithm is presented that aims to find the optimal balance between the number of control points employed and perceptual equality with respect to the original curve. Rather than being defined by statistical closeness to the original F0 curve, the quality of the stylized curve is defined on the basis of a dynamic tonal perception model. The number of control points is optimized on the basis of previous results showing that the stylization can be more radical in those areas of the signal where tone perception is less accurate, i.e. in non-prominent areas. Perceptual tests show that, in terms of the perceptual equality of the stylization, this approach performs as well as other reference approaches, with the advantage of using significantly fewer control points. Although it is based on a theoretical background employing phonological units like syllables, the proposed phonetic approach does not require any preliminary segmentation or annotation step. Instead, it combines acoustic parameters related to syllabification and prominence detection into a single model, which has been designed to be both integrated, in the sense that it does not introduce any pitfalls in the process, and dynamic, in the sense that it does not include rigid tonal perception thresholds.
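To make the idea of "fewer control points where perception is less accurate" concrete, here is a minimal sketch. It is not the paper's dynamic tonal perception model. It swaps in a generic Douglas-Peucker-style piecewise-linear simplification and assumes a hypothetical per-sample `prominence` score in [0, 1] is already available. The function names, the semitone reference of 100 Hz, and the tolerance values `base_tol` and `max_tol` are all illustrative assumptions.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert F0 values in Hz to semitones relative to ref_hz (assumed reference)."""
    return [12.0 * math.log2(f / ref_hz) for f in f0_hz]


def stylize(times, pitch_st, prominence, base_tol=0.5, max_tol=2.0):
    """Return sorted indices of the control points kept by the simplification.

    times      -- strictly increasing sample times
    pitch_st   -- pitch in semitones at each sample
    prominence -- hypothetical score in [0, 1] per sample (1 = most prominent)

    The allowed deviation shrinks from max_tol to base_tol (both in semitones,
    illustrative values) as prominence grows, so non-prominent stretches are
    simplified more aggressively.
    """
    n = len(times)
    tol = [max_tol - (max_tol - base_tol) * p for p in prominence]
    keep = {0, n - 1}
    stack = [(0, n - 1)]
    while stack:
        lo, hi = stack.pop()
        worst, worst_ratio = None, 1.0
        for i in range(lo + 1, hi):
            # Linear interpolation between the two current control points.
            frac = (times[i] - times[lo]) / (times[hi] - times[lo])
            interp = pitch_st[lo] + frac * (pitch_st[hi] - pitch_st[lo])
            # Deviation measured in units of the local tolerance.
            ratio = abs(pitch_st[i] - interp) / tol[i]
            if ratio > worst_ratio:
                worst, worst_ratio = i, ratio
        if worst is not None:
            # Split at the sample that most exceeds its tolerance.
            keep.add(worst)
            stack.append((lo, worst))
            stack.append((worst, hi))
    return sorted(keep)
```

With the same small pitch peak, `stylize([0, 1, 2, 3, 4], [0, 0, 3, 0, 0], [1] * 5)` keeps all five samples, while setting the prominence to `[0] * 5` keeps only indices `[0, 2, 4]`. This mirrors, in a very simplified way, the trade-off the abstract describes.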
Keywords
Pitch stylization, Tonal perception, Pitch and energy interaction