Gesture Annotation With a Visual Search Engine for Multimodal Communication Research.

Thirty-Second AAAI Conference on Artificial Intelligence / Thirtieth Innovative Applications of Artificial Intelligence Conference / Eighth AAAI Symposium on Educational Advances in Artificial Intelligence (2018)

Abstract
Human communication is multimodal and includes elements such as gesture and facial expression along with spoken language. Modern technology makes it feasible to capture all such aspects of communication in natural settings. As a result, similar to fields such as genetics, astronomy and neuroscience, scholars in areas such as linguistics and communication studies are on the verge of a data-driven revolution in their fields. These new approaches require analytical support from machine learning and artificial intelligence to develop tools to help process the vast data repositories. The Distributed Little Red Hen Lab project is an international team of interdisciplinary researchers building a large-scale infrastructure for data-driven multimodal communications research. In this paper, we describe a machine learning system developed to automatically annotate a large database of television program videos as part of this project. The annotations mark regions where people or speakers are on screen along with body part motions including head, hand and shoulder motion. We also annotate a specific class of gestures known as timeline gestures. An existing gesture annotation tool, ELAN, can be used with these annotations to quickly locate gestures of interest. Finally, we provide an update mechanism for the system based on human feedback. We empirically evaluate the accuracy of the system as well as present data from pilot human studies to show its effectiveness at aiding gesture scholars in their work.
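The paper states that the automatically generated annotations are meant to be opened in the existing ELAN annotation tool so that scholars can quickly locate gestures of interest. As an illustration only, the short Python sketch below shows one way machine-generated detections could be written into an ELAN-readable .eaf file using the pympi-ling library; the detection list, tier labels, and file paths are hypothetical and are not taken from the paper.

    import pympi  # pympi-ling: reads and writes ELAN .eaf files

    # Hypothetical machine-generated detections: (start_ms, end_ms, label).
    detections = [
        (12_300, 14_050, "hand_motion"),
        (18_200, 19_400, "timeline_gesture"),
    ]

    # Create an empty ELAN document and link it to the source video
    # (the video path is illustrative).
    eaf = pympi.Elan.Eaf()
    eaf.add_linked_file("news_clip.mp4", mimetype="video/mp4")

    # One tier per annotation type, mirroring labels such as head, hand,
    # shoulder, and timeline-gesture annotations described in the abstract.
    for tier in ("hand_motion", "timeline_gesture"):
        eaf.add_tier(tier)

    # Add each detected span; ELAN times are in milliseconds.
    for start_ms, end_ms, label in detections:
        eaf.add_annotation(label, start_ms, end_ms, value=label)

    # Write the .eaf file so it can be opened directly in ELAN.
    eaf.to_file("news_clip_autoannotations.eaf")

Loading such a file in ELAN places each detected span on its own tier, which is the kind of pre-annotation the paper describes using to help gesture scholars jump to candidate regions instead of scrubbing through entire television recordings.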