Other Agents' Actions as Asynchronous Events
AAAI Spring Symposium: Distributed Plan and Schedule Management (2006)
Abstract
An individual planning agent does not generally have sufficient computational resources at its disposal to produce an optimal plan in a complex domain, as deliberation itself requires and consumes scarce resources. This problem is further exacerbated in a distributed planning context in which multiple, heterogeneous agents must expend a portion of their resource allotment on communication, negotiation, and shared planning activities with other cooperative agents. Because other agents can have different temporal grain sizes, planning horizons, deadlines, and access to distinct local information, the delays associated with local deliberation and, in turn, shared negotiation are asynchronous, unpredictable, and widely variable. We address this problem using a principled, decision-theoretic approach based on recent advances in Generalized Semi-Markov Decision Processes (GSMDPs). In particular, we use GSMDPs to model agents that develop a continuous-time deliberation policy offline, which can then be consulted to dynamically select both deliberation-level and domain-level actions at plan execution time. This scheme allows individual agents to model other cooperative agents' actions essentially as asynchronous events, e.g., events that might or might not fulfill a request (uncertain effect) after a stochastically determined delay (uncertain event duration). With this approach, the decision-theoretic planner for the individual agent can make near-optimal execution-time decisions that trade off the risks and opportunities associated with its own actions, other agents' actions, and asynchronous external threats.

[...] abstraction than the base domain called the meta domain, with the higher-level planning sometimes referred to as metaplanning, or metacognition.
In general, deliberation scheduling involves deciding what aspects of an artifact (e.g., the agent's plan) should be improved, what methods of improvement should be chosen, and how much time should be devoted to each of these activities. In particular, we define deliberation scheduling to be a form of metacognition in which two planners exist: one, a base-level planner that attempts to solve planning problems in the base domain, and two, a meta-level planner deciding how best to instruct the base-level planner to expend units of planning effort. Both the meta and base domains are stochastic. Actions in the meta domain consist of a set of base domain problem configurations from which to choose, each of which constitutes a planning problem of varying difficulty (the successful result of which is a plan of corresponding quality), which might or might not be solvable, and which takes an unknown (but probabilistic) amount of time to complete. Similarly, the base domain's events and actions can succeed or fail, and have continuously distributed durations. The goal of the meta-level planner is to schedule the deliberation effort available to the base-level planner to maximize the expected utility of the base domain plans. This complex problem is further complicated by the fact that base planning and execution happen concurrently, further constraining the resources being allocated by the meta-level planner.
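To make the meta-level choice concrete, the following is a minimal, one-step (myopic) sketch of picking a base domain problem configuration by expected utility. It is not the GSMDP policy described in the abstract. The names `Config`, `expected_gain`, and `choose_config`, the exponential duration model, the fixed time budget, and the example numbers are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """A hypothetical base domain problem configuration (a meta-level action)."""
    name: str
    quality: float     # utility of the resulting plan if deliberation succeeds
    p_solvable: float  # probability that the problem is solvable at all
    mean_time: float   # mean deliberation time (assumed exponentially distributed)


def expected_gain(cfg: Config, time_left: float) -> float:
    """Expected utility of starting cfg with time_left units of deliberation remaining."""
    if time_left <= 0:
        return 0.0
    # Assumption: duration ~ Exponential(mean_time), so
    # P(finish within time_left) = 1 - exp(-time_left / mean_time).
    p_done = 1.0 - math.exp(-time_left / cfg.mean_time)
    return cfg.p_solvable * p_done * cfg.quality


def choose_config(configs: list[Config], time_left: float) -> Config:
    """Myopic meta-level decision: pick the configuration with the highest expected gain."""
    return max(configs, key=lambda c: expected_gain(c, time_left))


# Illustrative numbers only; they do not come from the paper.
configs = [
    Config("quick_patch", quality=2.0, p_solvable=0.9, mean_time=1.0),
    Config("full_replan", quality=10.0, p_solvable=0.6, mean_time=8.0),
]
```

Under these assumed numbers, `choose_config(configs, 1.0)` selects `quick_patch`, while `choose_config(configs, 20.0)` selects `full_replan`: a short time budget favors the cheap, low-quality problem, and a long one favors the harder problem with the better plan. A full treatment would also have to account for concurrent execution and asynchronous events, which this sketch ignores.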
Keywords
grain size,expected utility