Towards Vocabulary-Independent Speech Indexing For Large-Scale Repositories

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 (2008)

Abstract
The out-of-vocabulary (OOV) problem remains a challenge for word-lattice-based speech indexing. Sub-word-based approaches address this problem effectively for small-scale tasks, but suffer from poor precision on large-scale databases because they lack strong language model constraints. We propose a two-step method for searching OOV queries in large-scale databases. First, result candidates are extracted from a sub-word-based system, ensuring high recall. The candidates are then refined by word-lattice rescoring, aiming at high precision. Experiments on a 160-hour lecture set show that the proposed approach achieves a relative improvement of 8.7% over the sub-word-based baseline, and of 19.7% when only single-word queries are considered.
Keywords
Out-Of-Vocabulary, Keyword Spotting, Word Lattice, Phonetic Lattice, Large Scale
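
The two-step search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` record, the function names, both thresholds, and the `rescorer` callable (a stand-in for word-lattice rescoring) are all assumptions introduced here to show the control flow of a high-recall first pass followed by a high-precision second pass.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one putative hit of the query."""
    utterance_id: str
    start: float  # seconds
    end: float    # seconds
    score: float  # confidence in [0, 1]


def extract_candidates(subword_hits: Iterable[Candidate],
                       recall_threshold: float = 0.1) -> List[Candidate]:
    """Step 1: keep every sub-word hit above a deliberately low
    threshold, so that few true occurrences are lost (high recall).
    The threshold value is an assumption."""
    return [c for c in subword_hits if c.score >= recall_threshold]


def rescore(candidates: Iterable[Candidate],
            rescorer: Callable[[Candidate], float],
            precision_threshold: float = 0.5) -> List[Candidate]:
    """Step 2: replace each first-pass score with the rescorer's score
    and keep only hits above a stricter threshold (high precision).
    `rescorer` stands in for word-lattice rescoring."""
    refined = []
    for c in candidates:
        new_score = rescorer(c)
        if new_score >= precision_threshold:
            refined.append(
                Candidate(c.utterance_id, c.start, c.end, new_score))
    # Best-scoring hits first.
    return sorted(refined, key=lambda c: c.score, reverse=True)


def two_step_search(subword_hits: Iterable[Candidate],
                    rescorer: Callable[[Candidate], float],
                    recall_threshold: float = 0.1,
                    precision_threshold: float = 0.5) -> List[Candidate]:
    """Chain the two steps: candidate extraction, then refinement."""
    candidates = extract_candidates(subword_hits, recall_threshold)
    return rescore(candidates, rescorer, precision_threshold)
```

With toy data (made-up scores, not results from the paper), the function could be called as follows:

```python
hits = [
    Candidate("lec1", 1.0, 1.5, 0.80),
    Candidate("lec2", 3.0, 3.4, 0.20),
    Candidate("lec3", 7.0, 7.6, 0.05),  # dropped in step 1
]
toy_scores = {"lec1": 0.9, "lec2": 0.3}  # hypothetical rescoring output
results = two_step_search(hits, lambda c: toy_scores[c.utterance_id])
```

Keeping the rescorer as a plain callable separates the pipeline logic from whatever lattice representation a real system would use.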