Multi-Corpus Acoustic-to-Articulatory Speech Inversion

Nadee Seneviratne, Ganesh Sivaraman, Carol Espy-Wilson

INTERSPEECH (2019)

Abstract

Several technologies are used to measure speech articulatory movements, including electromagnetic articulometry (EMA), ultrasound, real-time magnetic resonance imaging (MRI), and X-ray microbeam. Each technique provides a different view of the vocal tract. Measurements made with similar techniques also differ greatly because of differences in sensor placement and speaker anatomy. This limits most articulatory studies to single datasets. However, to yield better results in their applications, speech inversion systems should generalize better, which requires combining data from multiple sources. This paper proposes a multi-task learning based deep neural network architecture for acoustic-to-articulatory speech inversion, trained using three different articulatory datasets: two measured using EMA and one using X-ray microbeam. Experiments show improved accuracy of the proposed acoustic-to-articulatory mapping compared to systems trained on single datasets.
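
The multi-task idea in the abstract can be illustrated with a small sketch: one shared hidden layer feeds a separate output head per corpus, so every dataset trains the shared layer while each head fits its own measurement space. This is only an illustration under stated assumptions, not the authors' architecture. The abstract does not give layer sizes, feature dimensions, or how outputs are organized, so the names `N_ACOUSTIC`, `N_HIDDEN`, `HEAD_DIMS`, the corpus labels, and the random stand-in data are all hypothetical.

```python
import math
import random

# All sizes and corpus labels are hypothetical; the abstract does not specify them.
N_ACOUSTIC = 4   # acoustic features per frame (assumption)
N_HIDDEN = 5     # width of the shared hidden layer (assumption)
HEAD_DIMS = {"ema_a": 3, "ema_b": 3, "xrmb": 2}  # one output head per corpus


def make_layer(rng, n_in, n_out):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Apply a dense layer to a single input vector x."""
    weights, biases = layer
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]


def forward(params, x, corpus):
    """Shared hidden layer followed by the head that belongs to `corpus`."""
    hidden = [math.tanh(v) for v in dense(x, params["shared"])]
    return dense(hidden, params["heads"][corpus])


def multitask_loss(params, batches):
    """Sum of per-corpus mean squared errors; batches maps corpus -> [(x, y), ...]."""
    total = 0.0
    for corpus, frames in batches.items():
        sq_errors = []
        for x, y in frames:
            pred = forward(params, x, corpus)
            sq_errors.extend((p - t) ** 2 for p, t in zip(pred, y))
        total += sum(sq_errors) / len(sq_errors)
    return total


rng = random.Random(0)
params = {
    "shared": make_layer(rng, N_ACOUSTIC, N_HIDDEN),
    "heads": {name: make_layer(rng, N_HIDDEN, dim) for name, dim in HEAD_DIMS.items()},
}
# Random stand-in data; real inputs would be acoustic frames paired with measured trajectories.
batches = {
    name: [
        ([rng.gauss(0, 1) for _ in range(N_ACOUSTIC)], [rng.gauss(0, 1) for _ in range(dim)])
        for _ in range(8)
    ]
    for name, dim in HEAD_DIMS.items()
}
loss = multitask_loss(params, batches)
```

Because the heads are separate, datasets with different sensor layouts never need to share a target space; only the shared layer sees data from all three corpora. A real system would minimize `multitask_loss` by gradient descent in a deep learning framework rather than with plain Python lists.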

Keywords

Acoustic-to-articulatory speech inversion, multi-task learning, articulatory phonology, tract variables
