Tractable POMDP Planning Algorithms for Optimal Teaching in "SPAIS"

Georgios Theocharous, Richard Beckwith, Nicholas Butko, Matthai Philipose

msra (2009)

Abstract

In this paper, we develop a system for teaching the task of sorting a set of virtual coins. Teaching is a challenging domain for AI systems because three problems must be solved at once: a teacher must simultaneously infer social variables (attention, boredom, confusion, expertise, aptitude) and physical ones (task progress, objects being used, current activity), and she must then combine this knowledge to plan effective moment-to-moment interaction strategies. We develop a framework called SPAIS (Socially and Physically Aware Interaction Systems), in which Social Variables define the transition probabilities of a POMDP whose states are Physical Variables. Optimal Teaching with SPAIS corresponds to solving for an optimal policy in a very large factored POMDP that combines both types of variables, a difficult computational problem. To make the POMDP approach more tractable, we devised a policy-switching methodology among simpler POMDP solutions, each one representing the best way to teach a different type of student (set of Social Variables). Our algorithms switch between prototypical states of pupils either based on Social Variables likelihood, or simply using the reward signal in algorithms for online learning with expert advice. In our results, we first demonstrate a system for teaching the task by prompting in an optimal way. Second, we show that our policy-switching algorithms can produce POMDP policies with teaching performance equivalent to the complete, single-model approach in a fraction of the time.
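
The reward-driven variant of policy switching mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which expert-advice algorithm the authors use, so the code below substitutes the standard EXP3 update as an assumption. Each "expert" stands for one precomputed POMDP policy tailored to a student type. The class name `Exp3PolicySwitcher`, the helper `simulate`, the exploration rate `gamma`, and the fixed per-policy rewards are all hypothetical and chosen only for illustration.

```python
import math
import random


class Exp3PolicySwitcher:
    """Reward-driven switching among K precomputed policies (EXP3-style).

    Assumption: rewards are scaled to [0, 1] and only the reward of the
    policy that was actually executed is observed.
    """

    def __init__(self, n_policies, gamma=0.1, seed=0):
        self.k = n_policies
        self.gamma = gamma                  # exploration rate in (0, 1]
        self.weights = [1.0] * n_policies   # one weight per expert policy
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the normalized weights with uniform exploration.
        total = sum(self.weights)
        return [(1.0 - self.gamma) * w / total + self.gamma / self.k
                for w in self.weights]

    def choose(self):
        # Sample the index of the policy to run for the next step.
        probs = self.probabilities()
        return self.rng.choices(range(self.k), weights=probs)[0]

    def update(self, chosen, reward):
        # Importance-weighted reward estimate for the executed policy.
        probs = self.probabilities()
        estimate = reward / probs[chosen]
        self.weights[chosen] *= math.exp(self.gamma * estimate / self.k)
        # Rescale so the largest weight is 1, which avoids overflow.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def simulate(rewards_per_policy, n_steps=200):
    """Toy loop: each policy always yields a fixed, hypothetical reward."""
    switcher = Exp3PolicySwitcher(len(rewards_per_policy))
    for _ in range(n_steps):
        i = switcher.choose()
        switcher.update(i, rewards_per_policy[i])
    return switcher.probabilities()
```

In a teaching loop, `choose()` would pick which student-type policy issues the next prompt, and `update()` would be called with the reward observed afterwards. The likelihood-based variant described in the abstract would instead select the policy whose Social Variables best explain the observations; it is not shown here.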
