Goal Alignment: Re-analyzing Value Alignment Problems Using Human-Aware AI
AAAI 2024 (2024)
Abstract
While the question of misspecified objectives has received much attention in recent years, most works in this area focus primarily on challenges arising from the complexity of the objective specification mechanism (for example, the use of reward functions). However, the complexity of the specification mechanism is only one of many reasons why a user may misspecify their objective. A foundational cause of misspecification that these works overlook is the inherent asymmetry between the user's expectations about the agent's behavior and the behavior the agent actually generates for the specified objective. To address this, we propose a novel formulation of the objective misspecification problem that builds on the human-aware planning literature, which was originally introduced to support explanation and explicable behavior generation. Additionally, we propose a first-of-its-kind interactive algorithm that can use information generated under incorrect beliefs about the agent to determine the true underlying goal of the user.
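To make the abstract's central idea concrete, here is a minimal, hypothetical sketch (not the paper's actual formulation) of inferring a user's true goal from feedback given under an incorrect belief about the agent. The 1-D world, the candidate goals, the step models, and all function names are illustrative assumptions: the simulated user thinks each agent action moves 1 cell while the agent actually moves 2, yet because the system knows the user's believed model, it can interpret approvals in the user's frame and still identify the goal.

```python
# Hypothetical sketch: interactive goal inference under a user's
# incorrect model of the agent. All values below are illustrative.

AGENT_STEP = 2   # cells the agent actually moves per action
USER_STEP = 1    # cells the user BELIEVES each action moves
CANDIDATES = [2, 5, 8]   # candidate goal cells on a line
TRUE_GOAL = 5            # known only to the simulated user

def user_feedback(num_actions, true_goal=TRUE_GOAL):
    """The user judges a plan by where they BELIEVE it ends,
    i.e., under their (wrong) step model of the agent."""
    believed_end = num_actions * USER_STEP
    return abs(believed_end - true_goal) <= 1

def infer_goal(candidates):
    """Probe with plans of varying length. A candidate goal is viable
    iff the user's approval pattern matches the believed endpoints
    that land near that candidate. Reading feedback in the USER'S
    frame is what lets inference succeed despite their wrong model:
    the agent's actual endpoint (num_actions * AGENT_STEP) never
    enters the user's judgment."""
    probes = range(10)                          # plans with 0..9 actions
    feedback = {n: user_feedback(n) for n in probes}
    return [g for g in candidates
            if all(fb == (abs(n * USER_STEP - g) <= 1)
                   for n, fb in feedback.items())]

print(infer_goal(CANDIDATES))   # -> [5]
```

The key design point this toy illustrates: feedback generated under an incorrect dynamics model is still informative about the goal, provided the inference procedure models the user's beliefs rather than assuming the user shares the agent's model.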
Keywords
HAI: Human-Aware Planning and Behavior Prediction; HAI: Interaction Techniques and Devices; HAI: Learning Human Values and Preferences; PEAI: Safety, Robustness & Trustworthiness; PRS: Activity and Plan Recognition