Automatically Optimizing Utterance Classification Performance Without Human In The Loop
12th Annual Conference of the International Speech Communication Association 2011 (INTERSPEECH 2011), Vols 1-5 (2011)
Abstract
Utterance Classification (UC) has become developers' preferred alternative to traditional Context Free Grammars (CFGs) for voice menus in telephony applications. This data-driven method achieves higher accuracy and can exploit large amounts of labeled training data. However, having humans label the training data manually is expensive. This paper provides a robust recipe for training a UC system using inexpensive acoustic data with limited transcriptions or semantic labels. It also describes two new algorithms that use caller confirmations, which occur naturally within a dialog, to generate pseudo semantic labels. Experimental results show that, once enough labeled data is available to reach reasonable accuracy, both algorithms can use unlabeled data to match the performance of a system trained with labeled data, while completely eliminating the need for human supervision.
Keywords
Call Routing, Statistical grammars, Spoken language understanding (SLU), Utterance Classification (UC)