Automatic Facial Image Annotation And Retrieval By Integrating Voice Label And Visual Appearance

Hong-Wun Jheng, Bor-Chun Chen, Yan-Ying Chen, Winston Hsu

MM '14: 2014 ACM Multimedia Conference, Orlando, Florida, USA, November 2014

Abstract

Annotation is important for managing and retrieving large numbers of photos, but it is generally labor-intensive and time-consuming. Speaking while taking photos, however, is straightforward and effortless, and annotating by voice is faster than typing. To minimize the manual cost of annotating photos, we propose a novel framework that uses the sparse spoken annotations recorded during capture as voice labels and automatically labels every facial image in the photo collection. To accomplish this, we employ a probabilistic graphical model that integrates voice labels and visual appearance for inference. Combined with group prior estimation and gender attribute association, the framework achieves strong performance on the proposed synthesized group photo collections.

Keywords

Face Annotation, Spoken Annotation, Image Retrieval
