Multimodal Reference Resolution In Collaborative Assembly Tasks.

MA3HMI@ICMI (2018)

Cited by 13 | Views 13
Abstract
Humans use verbal and non-verbal cues to communicate their intent in collaborative tasks. In situated dialogue, speakers typically direct their interlocutor's attention to referent objects using multimodal cues, and references to such entities are resolved collaboratively. In this study we designed a multiparty task in which humans teach each other how to assemble furniture, and captured eye gaze, speech, and pointing gestures. We analysed which multimodal cues carry the most information for resolving referring expressions, and report an object saliency classifier that, using multisensory input from the speaker and addressee, detects the referent objects during the collaborative task.
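
To illustrate one way such an object saliency classifier could fuse multisensory cues, here is a minimal Python sketch. The feature set (gaze probabilities, pointing-angle error, speech mention), the toy data, and the random-forest choice are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a referent-object saliency classifier fusing
# gaze, pointing, and speech features per candidate object.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Per-object feature vector (assumed features):
# [speaker_gaze_prob, addressee_gaze_prob,
#  pointing_angle_error_rad, mentioned_in_speech]
X_train = np.array([
    [0.90, 0.70, 0.1, 1],  # referent: both gaze at it, pointed at, mentioned
    [0.10, 0.20, 1.2, 0],  # distractor
    [0.80, 0.10, 0.3, 1],  # referent
    [0.05, 0.10, 1.5, 0],  # distractor
])
y_train = np.array([1, 0, 1, 0])  # 1 = referent, 0 = non-referent

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Score all candidate objects in the scene; the most salient one
# is taken as the resolved referent.
candidates = np.array([
    [0.85, 0.60, 0.2, 1],
    [0.20, 0.30, 0.9, 0],
])
saliency = clf.predict_proba(candidates)[:, 1]
print("resolved referent index:", int(np.argmax(saliency)))
```

Treating each candidate object as an instance and ranking by predicted saliency mirrors the abstract's framing of referent detection as a per-object classification over fused speaker and addressee signals.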
Keywords
referential eye gaze, grounding, human-robot interaction