Clip retrieval using multi-modal biometrics in meeting archives

Himanshu Vajaria, Sudeep Sarkar, Rangachar Kasturi

Tampa, FL (2008)

Abstract

We present a system to retrieve all clips from a meeting archive that show a particular individual speaking, using a single face or voice sample as the query. The system incorporates three novel ideas. One, rather than match the query to each individual sample in the archive, samples within a meeting are grouped first, generating a cluster of samples per individual. The query is then matched to the cluster, taking advantage of multiple samples to yield a robust decision. Two, automatic audio-visual association is performed, which allows a bi-modal retrieval of clips even when the query is uni-modal. Three, the biometric recognition uses individual-specific score distributions learnt from the clusters, in a likelihood-ratio-based decision framework that obviates the need for explicit normalization or modality weighting. The resulting system, which is completely automated, performs with 92.6% precision at 90% recall on a dataset of 16 real meetings spanning a total of 13 hours.
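The likelihood-ratio decision described in the abstract can be illustrated with a minimal sketch. The abstract does not give the score models, so the code below makes several assumptions that are not from the paper: match scores are modeled as Gaussians, the "genuine" distribution is fitted from scores between samples inside one individual's cluster, the "impostor" distribution from scores between that cluster and other clusters, and the function names (`fit_gaussian`, `log_likelihood_ratio`, `cluster_matches`) are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_gaussian(scores, min_sigma=1e-3):
    """Fit a 1-D Gaussian to match scores (assumed model, not from the paper)."""
    mu = mean(scores)
    sigma = max(stdev(scores), min_sigma) if len(scores) > 1 else min_sigma
    return mu, sigma


def gaussian_logpdf(x, mu, sigma):
    """Log density of a Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def log_likelihood_ratio(query_scores, genuine_scores, impostor_scores):
    """Sum of per-sample log likelihood ratios for one query against one cluster.

    query_scores: match scores between the query and each sample in the cluster.
    genuine_scores / impostor_scores: scores used to learn the cluster-specific
    distributions (hypothetical inputs for this sketch).
    """
    g_mu, g_sigma = fit_gaussian(genuine_scores)
    i_mu, i_sigma = fit_gaussian(impostor_scores)
    return sum(
        gaussian_logpdf(s, g_mu, g_sigma) - gaussian_logpdf(s, i_mu, i_sigma)
        for s in query_scores
    )


def cluster_matches(llr_by_modality, threshold=0.0):
    """Accept the cluster if the summed log likelihood ratio exceeds a threshold.

    Summing across modalities assumes the face and voice scores are independent;
    that is an assumption of this sketch.
    """
    return sum(llr_by_modality.values()) > threshold


# Toy scores, invented purely for illustration.
genuine = [0.75, 0.80, 0.85, 0.90]
impostor = [0.20, 0.30, 0.35, 0.25]
llr_same = log_likelihood_ratio([0.80, 0.82], genuine, impostor)
llr_other = log_likelihood_ratio([0.30, 0.28], genuine, impostor)
```

Because each cluster contributes its own fitted distributions, the ratio is already on a comparable scale across individuals and modalities, which is one way to read the abstract's claim that no explicit normalization or modality weighting is needed.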

Keywords

audio-visual systems, biometrics (access control), face recognition, image matching, image sampling, information retrieval systems, learning (artificial intelligence), pattern clustering, speaker recognition, video retrieval, automatic audio-visual association, clip retrieval, individual-specific score distribution, meeting archive, multimodal biometric recognition, query matching, robust decision framework, speaker face recognition
