Clip retrieval using multi-modal biometrics in meeting archives

Tampa, FL (2008)

Abstract
We present a system to retrieve all clips from a meeting archive that show a particular individual speaking, using a single face or voice sample as the query. The system incorporates three novel ideas. One, rather than match the query to each individual sample in the archive, samples within a meeting are grouped first, generating a cluster of samples per individual. The query is then matched to the cluster, taking advantage of multiple samples to yield a robust decision. Two, automatic audio-visual association is performed, which allows a bi-modal retrieval of clips even when the query is uni-modal. Three, the biometric recognition uses individual-specific score distributions learnt from the clusters, in a likelihood ratio based decision framework that obviates the need for explicit normalization or modality weighting. The resulting system, which is completely automated, performs with 92.6% precision at 90% recall on a dataset of 16 real meetings spanning a total of 13 hours.
Keywords
audio-visual systems, biometrics (access control), face recognition, image matching, image sampling, information retrieval systems, learning (artificial intelligence), pattern clustering, speaker recognition, video retrieval, automatic audio-visual association, clip retrieval, individual-specific score distribution, meeting archive, multimodal biometric recognition, query matching, robust decision framework, speaker face recognition
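
The likelihood ratio based decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian score models, the averaging of log-likelihood ratios over a cluster, the zero threshold, all function names, and all score values are assumptions made for illustration. The sketch shows why summing per-modality log-likelihood ratios needs no explicit score normalization or modality weights, since each modality's scores are interpreted through its own individual-specific distributions.

```python
import math
from statistics import mean, stdev


def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def fit_gaussian(scores, min_sigma=1e-3):
    """Fit (mean, std) to match scores; Gaussian modelling is an assumption here."""
    return mean(scores), max(stdev(scores), min_sigma)


def log_likelihood_ratio(score, genuine, impostor):
    """log p(score | same person) - log p(score | different person)."""
    return gaussian_logpdf(score, *genuine) - gaussian_logpdf(score, *impostor)


def cluster_llr(query_scores, genuine, impostor):
    """Average the LLR of the query against every sample in one cluster."""
    return mean(log_likelihood_ratio(s, genuine, impostor) for s in query_scores)


# Hypothetical similarity scores for one individual's cluster.
# "Genuine" scores compare samples within the cluster; "impostor" scores
# compare cluster samples against samples of other individuals.
face_genuine = fit_gaussian([0.82, 0.78, 0.90, 0.85])
face_impostor = fit_gaussian([0.30, 0.42, 0.25, 0.38])
voice_genuine = fit_gaussian([0.70, 0.65, 0.75, 0.72])
voice_impostor = fit_gaussian([0.40, 0.35, 0.50, 0.45])


def fused_decision(face_scores, voice_scores, threshold=0.0):
    """Sum per-modality LLRs; a modality with no scores contributes nothing."""
    total = 0.0
    if face_scores:
        total += cluster_llr(face_scores, face_genuine, face_impostor)
    if voice_scores:
        total += cluster_llr(voice_scores, voice_genuine, voice_impostor)
    return total > threshold, total


if __name__ == "__main__":
    print(fused_decision([0.84, 0.80], [0.71]))  # scores resembling the genuine model
    print(fused_decision([0.33, 0.29], []))      # face-only scores resembling the impostor model
```

Because the two modalities are combined in the log-likelihood-ratio domain, a face-only or voice-only comparison uses the same decision rule as a bi-modal one; the missing modality simply adds zero evidence.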