AutoGPT+P: Affordance-based Task Planning with Large Language Models
CoRR(2024)
Abstract
Recent advances in task planning leverage Large Language Models (LLMs) to
improve generalizability by combining such models with classical planning
algorithms to address their inherent limitations in reasoning capabilities.
However, these approaches face the challenge of dynamically capturing the
initial state of the task planning problem. To alleviate this issue, we propose
AutoGPT+P, a system that combines an affordance-based scene representation with
a planning system. Affordances encompass the action possibilities of an agent
on the environment and objects present in it. Thus, deriving the planning
domain from an affordance-based scene representation allows symbolic planning
with arbitrary objects. AutoGPT+P leverages this representation to derive and
execute a plan for a task specified by the user in natural language. In
addition to solving planning tasks under a closed-world assumption, AutoGPT+P
can also handle planning with incomplete information, e.g., tasks with missing
objects, by exploring the scene, suggesting alternatives, or providing a partial
plan. The affordance-based scene representation combines object detection with
an automatically generated object-affordance-mapping using ChatGPT. The core
planning tool extends existing work by automatically correcting semantic and
syntactic errors. Our approach achieves a success rate of 98%, surpassing the
current 81% success rate of the current state-of-the-art LLM-based planning
method SayCan on the SayCan instruction set. Furthermore, we evaluated our
approach on our newly created dataset with 150 scenarios covering a wide range
of complex tasks with missing objects, achieving a success rate of 79% on our
dataset. The dataset and the code are publicly available at
https://git.h2t.iar.kit.edu/birr/autogpt-p-standalone.
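
The idea of deriving the planner's initial state from an affordance-based scene representation can be illustrated with a minimal sketch. This is not the authors' implementation: the mapping contents, the predicate names, and the function `scene_to_pddl_init` are assumptions made up for illustration, and the abstract says the real object-affordance mapping is generated with ChatGPT rather than written by hand.

```python
from typing import Dict, List, Set

# Hypothetical object-affordance mapping (assumption for illustration only).
OBJECT_AFFORDANCES: Dict[str, Set[str]] = {
    "cup": {"grasp", "contain", "drink-from"},
    "knife": {"grasp", "cut"},
    "table": {"support"},
}


def scene_to_pddl_init(detected: List[str], oam: Dict[str, Set[str]]) -> str:
    """Turn detected object classes into PDDL-style :objects and :init text.

    Each detection becomes a uniquely named object, and every affordance of
    its class becomes a unary fact about that object.
    """
    counts: Dict[str, int] = {}
    objects: List[str] = []
    facts: List[str] = []
    for cls in detected:
        index = counts.get(cls, 0)
        counts[cls] = index + 1
        name = f"{cls}{index}"
        objects.append(name)
        # Sort affordances so the output is deterministic.
        for affordance in sorted(oam.get(cls, ())):
            facts.append(f"({affordance} {name})")
    return "(:objects " + " ".join(objects) + ")\n(:init " + " ".join(facts) + ")"


if __name__ == "__main__":
    print(scene_to_pddl_init(["cup", "cup", "table"], OBJECT_AFFORDANCES))
```

Because the facts are keyed on affordances rather than object classes, a newly detected object type only needs an entry in the mapping to become usable by a symbolic planner, which is the property the abstract describes as planning with arbitrary objects.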