Improving Instruction Following in Language Models Through Proxy-Based Uncertainty Estimation
ICML 2024 (2024)
Abstract
Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for the quality of paired responses based on Bayesian approximation. Trained with preference datasets, our uncertainty-enabled proxy not only scores rewards for responses but also evaluates their inherent uncertainty. Empirical results demonstrate significant benefits of incorporating the proposed proxy into language model training. Our method boosts the instruction-following capability of language models by refining data curation for training and improving policy optimization objectives, thereby surpassing existing methods by a large margin on benchmarks such as Vicuna and MT-bench. These findings highlight that our proposed approach substantially advances language model training and paves a new way of harnessing uncertainty within language models.
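The abstract does not specify how the Bayesian approximation is realized, but a common instantiation of this idea is Monte Carlo dropout: run the reward head several times with random dropout masks and treat the spread of the scores as the uncertainty. The sketch below is a hypothetical toy illustration of that general pattern, not the paper's actual URM architecture; the feature vectors, weights, and dropout rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "reward head": a fixed linear layer over response features.
# Hypothetical stand-in for a trained reward model; the paper's URM
# internals are not described in the abstract.
W = rng.normal(size=(16,))

def reward_with_uncertainty(features, n_samples=100, p_drop=0.1):
    """Score a response and estimate uncertainty via MC dropout.

    Applies a fresh Bernoulli dropout mask to the weights on each of
    n_samples stochastic passes, then returns the mean reward and the
    standard deviation of the scores as an uncertainty estimate.
    """
    scores = []
    for _ in range(n_samples):
        mask = rng.random(W.shape) > p_drop       # random dropout mask
        scores.append(features @ (W * mask) / (1 - p_drop))
    scores = np.asarray(scores)
    return scores.mean(), scores.std()

# Compare two candidate responses to the same instruction.
feat_a = rng.normal(size=(16,))
feat_b = rng.normal(size=(16,))
r_a, u_a = reward_with_uncertainty(feat_a)
r_b, u_b = reward_with_uncertainty(feat_b)
print(f"response A: reward={r_a:.3f} +/- {u_a:.3f}")
print(f"response B: reward={r_b:.3f} +/- {u_b:.3f}")
```

In a training pipeline, the uncertainty could then gate data curation (e.g., down-weighting preference pairs the proxy is unsure about), which matches the abstract's claim that the proxy refines both data curation and the policy optimization objective.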