Vocal Tract Inversion By Cepstral Analysis-By-Synthesis Using Chain Matrices

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 (2008)



Abstract

Acoustic-to-articulatory inversion for vowels is performed by cepstral analysis-by-synthesis, using chain-matrix calculation of vocal tract (VT) acoustics and the Maeda articulatory model. The derivative of the VT chain matrix with respect to the area function is calculated in a novel, efficient manner and used in the BFGS quasi-Newton method to optimize a distance measure between input and synthesized cepstral features over the entire articulatory trajectory. The optimization is initialized by a fast search of an articulatory codebook with a bin structure in formant space. The cost function also includes regularization and continuity terms to obtain realistic inverted VT shapes and smooth articulatory trajectories. Inversion is evaluated on the three diphthongs /ai/, /oi/ and /au/ of two speakers, one male and one female, from the University of Wisconsin X-ray microbeam (XRMB) database. Good agreement is achieved between inverted midsagittal vocal tract outlines and measured XRMB tongue and lip pellet positions, with an average relative error of less than 3% in the first three formants.
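
To illustrate the chain-matrix step mentioned in the abstract, the following is a minimal sketch of the forward calculation only: it multiplies per-section chain (ABCD) matrices of an area function and locates formants as poles of the volume-velocity transfer function. It assumes a lossless tube with an ideal open end at the lips (zero radiation impedance), which is a textbook simplification and not necessarily the acoustic model used in the paper. The function names and the constants `RHO` and `C` are hypothetical choices for this example. The Maeda model, the cepstral distance, the chain-matrix derivative, and the BFGS optimization are not reproduced here.

```python
import math

RHO = 1.14e-3  # air density in g/cm^3 (assumed value)
C = 3.5e4      # speed of sound in cm/s (assumed value)


def section_matrix(area, length, freq):
    """Chain matrix of one lossless uniform tube section (area in cm^2, length in cm)."""
    kl = 2.0 * math.pi * freq / C * length  # wavenumber times section length
    z0 = RHO * C / area                     # characteristic acoustic impedance
    return ((math.cos(kl), 1j * z0 * math.sin(kl)),
            (1j * math.sin(kl) / z0, math.cos(kl)))


def matmul2(m, n):
    """Product of two 2x2 matrices stored as nested tuples."""
    return ((m[0][0] * n[0][0] + m[0][1] * n[1][0],
             m[0][0] * n[0][1] + m[0][1] * n[1][1]),
            (m[1][0] * n[0][0] + m[1][1] * n[1][0],
             m[1][0] * n[0][1] + m[1][1] * n[1][1]))


def chain_matrix(areas, lengths, freq):
    """Overall chain matrix, multiplying sections from glottis to lips."""
    total = ((1, 0), (0, 1))
    for area, length in zip(areas, lengths):
        total = matmul2(total, section_matrix(area, length, freq))
    return total


def formants(areas, lengths, f_max=4000.0, step=5.0):
    """Formant frequencies in Hz, found as zero crossings of the D element.

    With zero pressure at the lips, U_lips / U_glottis = 1 / D, and D is
    real for a lossless tube, so formants sit where D changes sign.
    """
    found = []
    prev_f = 0.0
    prev_d = chain_matrix(areas, lengths, prev_f)[1][1].real
    f = step
    while f <= f_max:
        d = chain_matrix(areas, lengths, f)[1][1].real
        if (prev_d < 0) != (d < 0):
            # Linear interpolation between the two grid points.
            found.append(prev_f + (f - prev_f) * prev_d / (prev_d - d))
        prev_f, prev_d = f, d
        f += step
    return found


# Uniform 17.5 cm tube split into 35 sections of 0.5 cm, area 3 cm^2.
uniform = formants([3.0] * 35, [0.5] * 35)
```

For the uniform tube above, the result should lie close to the quarter-wavelength resonances at 500, 1500, 2500 and 3500 Hz, which gives a simple sanity check before trying non-uniform area functions.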


Keywords

Acoustic-to-articulatory inversion, Analysis-by-Synthesis, chain matrix, Maeda articulatory model
