Multimodal Reference Resolution In Collaborative Assembly Tasks.

MA3HMI@ICMI (2018)

Cited by 13 | Views 13
Abstract
Humans use verbal and non-verbal cues to communicate their intent in collaborative tasks. In situated dialogue, speakers typically direct their interlocutor's attention to referent objects using multimodal cues, and references to such entities are resolved collaboratively. In this study we designed a multiparty task in which humans teach each other how to assemble furniture, and captured eye gaze, speech, and pointing gestures. We analysed which multimodal cues carry the most information for resolving referring expressions, and report an object saliency classifier that, using multisensory input from the speaker and addressee, detects the referent objects during the collaborative task.
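
To illustrate one way such an object saliency classifier could fuse multisensory cues, here is a minimal Python sketch. The feature set (gaze probabilities, pointing-angle error, speech mention), the toy data, and the random-forest choice are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a referent-object saliency classifier fusing
# gaze, pointing, and speech features per candidate object.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Per-object feature vector (assumed features):
# [speaker_gaze_prob, addressee_gaze_prob,
#  pointing_angle_error_rad, mentioned_in_speech]
X_train = np.array([
    [0.90, 0.70, 0.1, 1],  # referent: both gaze at it, pointed at, mentioned
    [0.10, 0.20, 1.2, 0],  # distractor
    [0.80, 0.10, 0.3, 1],  # referent
    [0.05, 0.10, 1.5, 0],  # distractor
])
y_train = np.array([1, 0, 1, 0])  # 1 = referent, 0 = non-referent

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Score all candidate objects in the scene; the most salient one
# is taken as the resolved referent.
candidates = np.array([
    [0.85, 0.60, 0.2, 1],
    [0.20, 0.30, 0.9, 0],
])
saliency = clf.predict_proba(candidates)[:, 1]
print("resolved referent index:", int(np.argmax(saliency)))
```

Treating each candidate object as an instance and ranking by predicted saliency mirrors the abstract's framing of referent detection as a per-object classification over fused speaker and addressee signals.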
Keywords
referential eye gaze, grounding, human-robot interaction