POEM: A Personalized Online Education Scheme Based on Reinforcement Learning

2020 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE) (2020)

Abstract
As online e-learning systems become more prevalent, there is a growing need for them to accommodate individual differences among students. According to the concept of the zone of proximal development (ZPD), online students should be given educational content that is neither too easy nor too difficult, but slightly beyond their current abilities. However, following the ZPD principle is challenging in an online e-learning system for two reasons: the system does not know a priori the ability of its students, especially newly arrived ones; and the exact relationship between student feedback on teaching and student ability (i.e., the reward/gain function) is extremely complicated and unknown even to the students themselves. To address these issues, this paper proposes POEM, a personalized educational scheme that aims to maximize students' cumulative learning gains over multiple rounds. Specifically, rather than assuming a specific functional form for the reward, we first estimate the unknown reward function from noisy samples using a Gaussian process (GP) model. Then, a multi-armed bandit based algorithm selects teaching content with an adaptive difficulty level to balance exploration and exploitation. Simulation results demonstrate the effectiveness of the proposed method.
Keywords
personalized education, reinforcement learning, Gaussian process, multi-armed bandit, zone of proximal development (ZPD)
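
The two-step idea in the abstract (fit a GP to noisy learning gains, then pick the next difficulty level with a bandit rule) can be illustrated with a minimal sketch. The abstract does not specify the acquisition rule, kernel, or simulation setup, so everything below is an assumption made for illustration: a GP-UCB style selection rule, an RBF kernel, the parameter values, and the hypothetical `simulated_gain` function, which places the peak gain slightly above the student's ability to mimic the ZPD. It is not the authors' implementation.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between two 1-D sets of difficulty levels."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return variance * np.exp(-0.5 * ((a - b) / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_query, noise_std=0.1):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    prior_var = np.diag(rbf_kernel(x_query, x_query))
    if len(x_obs) == 0:
        # No feedback yet: fall back to the prior.
        return np.zeros(len(x_query)), np.sqrt(prior_var)
    K = rbf_kernel(x_obs, x_obs) + noise_std**2 * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    mean = K_s.T @ np.linalg.solve(K, np.asarray(y_obs, dtype=float))
    var = prior_var - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def simulated_gain(difficulty, ability, rng, noise_std=0.1):
    """Hypothetical noisy gain: highest slightly above the student's ability."""
    target = ability + 0.1  # assumed ZPD offset, not taken from the paper
    clean = np.exp(-((difficulty - target) ** 2) / (2 * 0.15**2))
    return clean + rng.normal(0.0, noise_std)


def run_gp_ucb(levels, ability, rounds=60, beta=2.0, seed=0):
    """Each round, pick the level maximizing mean + beta * std (UCB rule)."""
    rng = np.random.default_rng(seed)
    chosen, gains = [], []
    for _ in range(rounds):
        mean, std = gp_posterior(chosen, gains, levels)
        idx = int(np.argmax(mean + beta * std))
        chosen.append(float(levels[idx]))
        gains.append(float(simulated_gain(levels[idx], ability, rng)))
    # Final recommendation: level with the highest estimated gain.
    mean, _ = gp_posterior(chosen, gains, levels)
    return float(levels[int(np.argmax(mean))]), chosen, gains


levels = np.linspace(0.0, 1.0, 11)  # candidate difficulty levels (the arms)
best_level, chosen, gains = run_gp_ucb(levels, ability=0.5)
print("recommended difficulty:", best_level)
```

The `beta` parameter controls the trade-off the abstract mentions: a larger value favors exploring difficulty levels with uncertain gain, while a smaller value exploits the level currently estimated to be best.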