Gesture Enhanced Comprehension of Ambiguous Human-to-Robot Instructions

ICMI '20: International Conference on Multimodal Interaction, Virtual Event, Netherlands, October 2020

Abstract
This work demonstrates the feasibility and benefits of using pointing gestures, a naturally generated additional input modality, to improve the multimodal comprehension accuracy of human instructions to robotic agents in collaborative tasks. We present M2Gestic, a system that combines neural-based text parsing with a novel knowledge-graph traversal mechanism over a multimodal input of vision, natural-language text, and pointing. Via multiple studies of a benchmark tabletop manipulation task, we show that (a) M2Gestic achieves close-to-human performance in reasoning over unambiguous verbal instructions, and (b) incorporating pointing input (even with its inherent location uncertainty) into M2Gestic yields a significant (30%) accuracy improvement when verbal instructions are ambiguous.
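The abstract does not spell out how M2Gestic fuses its modalities, so the sketch below is only a hypothetical illustration of the general idea: scoring candidate tabletop objects by combining how well they match the parsed verbal instruction with a Gaussian likelihood over a noisy pointing estimate. All names (TabletopObject, resolve_referent, the sigma value) are assumptions for illustration, not the paper's API.

```python
# Hypothetical sketch: multimodal referent disambiguation in the spirit of
# M2Gestic. The paper's neural parser and knowledge-graph traversal are not
# reproduced; we assume parsed symbolic attributes and a 2D pointing estimate.
import math
from dataclasses import dataclass, field

@dataclass
class TabletopObject:
    name: str
    position: tuple                                # (x, y) on the table, in meters
    attributes: set = field(default_factory=set)   # e.g. {"red", "cup"}

def language_score(obj: TabletopObject, parsed_attrs: set) -> float:
    """Fraction of the parsed instruction attributes the object satisfies."""
    if not parsed_attrs:
        return 1.0  # no verbal constraint: all objects equally plausible
    return len(obj.attributes & parsed_attrs) / len(parsed_attrs)

def pointing_likelihood(obj: TabletopObject, point_xy, sigma=0.05) -> float:
    """Isotropic Gaussian over the noisy pointing target (sigma in meters)."""
    dx = obj.position[0] - point_xy[0]
    dy = obj.position[1] - point_xy[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def resolve_referent(objects, parsed_attrs, point_xy=None) -> TabletopObject:
    """Pick the object maximizing language score x pointing likelihood."""
    def score(obj):
        s = language_score(obj, parsed_attrs)
        if point_xy is not None:
            s *= pointing_likelihood(obj, point_xy)
        return s
    return max(objects, key=score)

if __name__ == "__main__":
    scene = [
        TabletopObject("cup_left",  (0.10, 0.30), {"red", "cup"}),
        TabletopObject("cup_right", (0.45, 0.30), {"red", "cup"}),
    ]
    # "Pick up the red cup" is ambiguous: both cups match the parse...
    ambiguous = resolve_referent(scene, {"red", "cup"})
    # ...but a pointing estimate near (0.44, 0.28) resolves it to cup_right.
    with_gesture = resolve_referent(scene, {"red", "cup"}, (0.44, 0.28))
    print(ambiguous.name, "->", with_gesture.name)
```

When the verbal instruction alone matches several objects, the language scores tie and the pointing likelihood breaks the tie, which mirrors the abstract's claim that gesture input helps precisely in the ambiguous case despite its location uncertainty.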