Audio hot spotting and retrieval using multiple features

Qian Hu, Fred Goodman, Stanley Boykin, Randy Fish, Warren Greiff

SpeechIR '04 Proceedings of the Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval at HLT-NAACL 2004 (2004)

Abstract

This paper reports our ongoing efforts to exploit multiple features derived from an audio stream, using source material such as broadcast news, teleconferences, and meetings. These features are derived from algorithms including automatic speech recognition, automatic speech indexing, speaker identification, and prosodic and audio feature extraction. We describe our research prototype, the Audio Hot Spotting System, which allows users to query and retrieve data from multimedia sources using these multiple features. The system aims to accurately find segments of user interest, i.e., audio hot spots, to within seconds of the actual event. In addition to spoken keywords, the system retrieves audio hot spots by speaker identity, words spoken by a specific speaker, a change of speech rate, and other non-lexical features, including applause and laughter. Finally, we discuss our approach to semantic, morphological, and phonetic query expansion to improve audio retrieval performance and to access cross-lingual data.

Keywords

multiple feature, audio feature extraction, audio hot spot, audio retrieval performance, audio stream, retrieves audio hot spot, automatic speech indexing, automatic speech recognition, speaker identification, speaker identity
