Similar Word Model for Unfrequent Word Enhancement in Speech Recognition

IEEE/ACM Transactions on Audio, Speech, and Language Processing (2016)

Abstract
The popular n-gram language model (LM) is weak for unfrequent words. Conventional approaches such as class-based LMs pre-define sharing structures, e.g., word classes, to address the problem. However, defining such structures requires prior knowledge, and the context sharing based on them is generally inaccurate. This paper presents a novel similar word model to enhance unfrequent words. In principle, we enrich the context of an unfrequent word by borrowing context information from some "similar words." Compared to conventional class-based methods, this new approach offers fine-grained context sharing by referring to the words that best match the target word, and it is more flexible because no sharing structures need to be defined by hand. Experiments on a large-scale Chinese speech recognition task demonstrate that the similar word approach improves performance on unfrequent words significantly, while leaving performance on general tasks almost unchanged.
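The abstract only sketches the idea of borrowing context statistics from similar words; the paper's exact formulation is not given here. The following minimal Python sketch is an illustrative assumption, not the authors' method: a smoothed bigram estimate for a rare word is interpolated with estimates obtained by substituting similarity-weighted "similar words" into the same context. The function names, the similarity list, and the interpolation weight lam are all hypothetical.

```python
from collections import defaultdict

def train_bigram_counts(sentences):
    """Collect bigram and context counts from tokenized sentences."""
    bigram = defaultdict(lambda: defaultdict(int))
    context = defaultdict(int)
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            bigram[prev][cur] += 1
            context[prev] += 1
    return bigram, context, vocab

def bigram_prob(bigram, context, prev, word, vocab_size, alpha=0.1):
    """Add-alpha smoothed bigram probability P(word | prev)."""
    return (bigram[prev][word] + alpha) / (context[prev] + alpha * vocab_size)

def similar_word_prob(bigram, context, prev, word, similar_words,
                      vocab_size, lam=0.5):
    """Blend the direct estimate for an unfrequent word with estimates
    obtained by placing its similar words in the same context,
    weighted by (hypothetical) similarity scores."""
    direct = bigram_prob(bigram, context, prev, word, vocab_size)
    if not similar_words:
        return direct
    total = sum(sim for _, sim in similar_words)
    borrowed = sum(
        (sim / total) * bigram_prob(bigram, context, prev, sw, vocab_size)
        for sw, sim in similar_words
    )
    return (1.0 - lam) * direct + lam * borrowed

# Toy usage: "metropolis" is rare, so it borrows context statistics from "city".
corpus = [
    "we visited the city".split(),
    "the city was crowded".split(),
    "we visited the metropolis".split(),
]
bigram, context, vocab = train_bigram_counts(corpus)
p_plain = bigram_prob(bigram, context, "the", "metropolis", len(vocab))
p_enhanced = similar_word_prob(bigram, context, "the", "metropolis",
                               [("city", 1.0)], len(vocab))
print(f"plain:    {p_plain:.4f}")
print(f"enhanced: {p_enhanced:.4f}")
```

In this toy setup the enhanced estimate rises because the frequent neighbor "city" contributes its richer context statistics; how similar words are selected and weighted is exactly what the paper addresses, and is not reproduced here.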
Keywords
Context, Speech recognition, Speech, Probability, Training, Adaptation models, IEEE transactions