Word-lattice based spoken-document indexing with standard text indexers

user-5d8054e8530c708f9920ccce（2008）

引用 10|浏览3

暂无评分

摘要

Indexing the spoken content of audio recordings requires automatic speech recognition, which is as of today not reliable. Unlike indexing text, we cannot reliably know from a speech recognizer whether a word is present at a given point in the audio; we can only obtain a probability for it. Correct use of these probabilities significantly improves spoken-document search accuracy. In this paper, we will first describe how to improve accuracy for "web-search style" (AND/phrase) queries into audio, by utilizing speech recognition alternates and word posterior probabilities based on word lattices. Then, we will present an end-to-end approach to doing so using standard text indexers, which by design cannot handle probabilities and unaligned alternates. We present a sequence of approximations that transform the numeric lattice-matching problem into a symbolic text-based one that can be implemented by a commercial full-text indexer. Experiments on a 170-hour lecture set show an accuracy improvement by 30-60% for phrase searches and by 130% for two-term AND queries, compared to indexing linear text.

查看译文

关键词

Audio mining,Search engine indexing,Phrase,Indexer,Speech recognition,Sound recording and reproduction,Natural language processing,Posterior probability,Computer science,Accuracy improvement,Artificial intelligence,Word lattice

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要