
Investigating the Downstream Impact of Grapheme-Based Acoustic Modeling on Spoken Utterance Classification

Ryan Price, Bhargav Srinivas Ch, Surbhi Singhal, Srinivas Bangalore

Spoken Language Technology Workshop (2018)

Abstract
Automatic speech recognition (ASR) and natural language understanding are critical components of spoken language understanding (SLU) systems. One obstacle to providing services with SLU systems in multiple languages is the cost of acquiring all of the language-specific resources required for ASR in each language. Modeling graphemes eliminates the need to obtain a pronunciation dictionary, which maps from speech sounds to words, and is one way to reduce ASR resource dependencies when rapidly developing ASR in new languages. However, little is known about the downstream impact on SLU task performance when graphemes are selected as the acoustic modeling unit. This work investigates acoustic modeling for the ASR component of an SLU system using grapheme-based approaches together with convolutional and recurrent neural network architectures. We evaluate both ASR word accuracy and spoken utterance classification (SUC) accuracy for English, Italian and Spanish language tasks and find that it is possible to achieve SUC accuracy comparable to that of conventional phoneme-based systems that rely on a pronunciation dictionary.
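To illustrate why grapheme modeling removes the pronunciation-dictionary dependency, the sketch below derives a lexicon directly from the spelling of each word. It is a minimal illustration only, not the authors' pipeline: the function name `grapheme_lexicon` and the choice to treat each lowercase character as one grapheme unit are assumptions made for this example.

```python
def grapheme_lexicon(words):
    """Map each word to a sequence of grapheme units.

    Assumption for this sketch: one grapheme unit per lowercase character.
    No language-specific pronunciation dictionary is consulted.
    """
    lexicon = {}
    for word in words:
        # Split the written form into characters, skipping any whitespace.
        lexicon[word] = [ch for ch in word.lower() if not ch.isspace()]
    return lexicon


# The same function covers words from any language without extra resources.
lexicon = grapheme_lexicon(["hello", "ciao", "hola"])
for word, units in lexicon.items():
    print(word, " ".join(units))
```

A phoneme-based lexicon would instead need a hand-built or learned entry for every word, which is the per-language resource cost the abstract refers to.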
Keywords
grapheme, spoken utterance classification, acoustic modeling, CTC, Lattice-Free MMI