Multi-Frame Amplitude Envelope Estimation for Modification of Singing Voice.

Gilles Degottex,Luc Ardaillon,Axel Roebel

IEEE/ACM Trans. Audio, Speech & Language Processing（2016）

引用 4|浏览12

暂无评分

摘要

Singingvoice synthesis benefits from very highquality estimation of the resonances and anti-resonances of the vocal tract filter (VTF), i.e., an amplitude spectral envelope. In the state of the art, a single frame of DFT transform is commonly used as a basis for building spectral envelopes. Even though multiple frame analysis (MFA) has already been suggested for envelope estimation, it is not yet used in concrete applications. Indeed, even though existing attempts have shown very interesting results, we will demonstrate that they are either over complicated or fail to satisfy the high accuracy that is necessary for singing voice. In order to allow future applications of MFA, this article aims to improve the theoretical understanding and advantages of MFA-based methods. The use of singing voice signals is very beneficial for studying MFA methods due to the fact that the VTF configuration can be relatively stable and, at the same time, the vibrato creates a regular variation that is easy to model. By simplifying and extending previous works, we also suggest and describe two MFA-based methods. To better understand the behaviors of the envelope estimates, we designed numerical measurements to assess single frame analysis and MFA methods using synthetic signals. With listening tests, we also designed two proofs of concept using pitch scaling and conversion of timbre. Both evaluations show clear and positive results for MFA-based methods, thus, encouraging this research direction for future applications.

查看译文

关键词

Cepstral analysis,Harmonic analysis,Estimation,Context,Speech,IEEE transactions,Speech processing

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要