Grounding Language for Robotic Manipulation via Skill Library

2023 2nd International Conference on Machine Learning, Cloud Computing and Intelligent Mining (MLCCIM)

Abstract
Given language instructions and a raw image, how can we enable robots to reason about semantic concepts and manipulate objects accordingly? Recent research on language-conditioned manipulation has introduced end-to-end frameworks that combine semantic understanding with precise spatial reasoning. However, these approaches require large amounts of training data and fail to generalize to more complex task scenes. We propose a novel Language-Goal-Skill architecture that decouples language-visual grounding from skill learning, making it more effective and generalizable. It leverages pre-trained models to infer manipulation skills, scene objects, and spatial relations, and builds a skill library for diverse task scenes. Experiments in simulated settings suggest that our approach achieves a higher success rate across multiple skills than baseline methods.
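The abstract sketches a two-stage pipeline: pre-trained models ground an instruction and image into a structured goal (skill, objects, spatial relations), which is then dispatched to a goal-conditioned policy from a skill library. Below is a minimal Python sketch of that decoupling; all names (Goal, ground_instruction, SkillLibrary, pick_place) are hypothetical illustrations rather than the paper's API, and the grounding step is stubbed instead of calling real pre-trained models.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Goal:
    """A grounded goal: which skill to run, on what, and where."""
    skill: str       # e.g. "pick_place"
    target: str      # object the instruction refers to
    relation: str    # spatial relation, e.g. "on", "left_of"
    reference: str   # reference object for the relation

def ground_instruction(instruction: str, image) -> Goal:
    """Language-visual grounding stage (hypothetical).

    The paper delegates this to pre-trained models that infer the
    manipulation skill, the scene objects, and their spatial relations;
    here it is stubbed so the sketch runs end to end.
    """
    return Goal(skill="pick_place", target="red block",
                relation="on", reference="blue bowl")

class SkillLibrary:
    """Maps skill names to goal-conditioned policies (hypothetical API)."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[[Goal], bool]] = {}

    def register(self, name: str, policy: Callable[[Goal], bool]) -> None:
        self._skills[name] = policy

    def execute(self, goal: Goal) -> bool:
        if goal.skill not in self._skills:
            raise KeyError(f"no skill registered for {goal.skill!r}")
        return self._skills[goal.skill](goal)

def pick_place(goal: Goal) -> bool:
    # Placeholder for a learned goal-conditioned manipulation policy.
    print(f"pick {goal.target}, place {goal.relation} {goal.reference}")
    return True

library = SkillLibrary()
library.register("pick_place", pick_place)
goal = ground_instruction("put the red block on the blue bowl", image=None)
library.execute(goal)  # -> pick red block, place on blue bowl
```

Because grounding and execution communicate only through the structured goal, new skills can be registered in the library without retraining the grounding models, which mirrors the generalization benefit the abstract claims for the decoupled design.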
Keywords
language-conditioned manipulation, language-visual grounding, goal-conditioned skill