Improving Instruction Following in Language Models Through Proxy-Based Uncertainty Estimation
ICML 2024 (2024)
Abstract
Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for the quality of paired responses based on Bayesian approximation. Trained with preference datasets, our uncertainty-enabled proxy not only scores rewards for responses but also evaluates their inherent uncertainty. Empirical results demonstrate significant benefits of incorporating the proposed proxy into language model training. Our method boosts the instruction-following capability of language models by refining data curation for training and improving policy optimization objectives, thereby surpassing existing methods by a large margin on benchmarks such as Vicuna and MT-bench. These findings highlight that our proposed approach substantially advances language model training and paves a new way of harnessing uncertainty within language models.
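The abstract does not specify how the Bayesian approximation is realized, but a common instantiation of this idea is Monte Carlo dropout: run the reward head several times with random dropout masks and treat the spread of the scores as the uncertainty. The sketch below is a hypothetical toy illustration of that general pattern, not the paper's actual URM architecture; the feature vectors, weights, and dropout rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "reward head": a fixed linear layer over response features.
# Hypothetical stand-in for a trained reward model; the paper's URM
# internals are not described in the abstract.
W = rng.normal(size=(16,))

def reward_with_uncertainty(features, n_samples=100, p_drop=0.1):
    """Score a response and estimate uncertainty via MC dropout.

    Applies a fresh Bernoulli dropout mask to the weights on each of
    n_samples stochastic passes, then returns the mean reward and the
    standard deviation of the scores as an uncertainty estimate.
    """
    scores = []
    for _ in range(n_samples):
        mask = rng.random(W.shape) > p_drop       # random dropout mask
        scores.append(features @ (W * mask) / (1 - p_drop))
    scores = np.asarray(scores)
    return scores.mean(), scores.std()

# Compare two candidate responses to the same instruction.
feat_a = rng.normal(size=(16,))
feat_b = rng.normal(size=(16,))
r_a, u_a = reward_with_uncertainty(feat_a)
r_b, u_b = reward_with_uncertainty(feat_b)
print(f"response A: reward={r_a:.3f} +/- {u_a:.3f}")
print(f"response B: reward={r_b:.3f} +/- {u_b:.3f}")
```

In a training pipeline, the uncertainty could then gate data curation (e.g., down-weighting preference pairs the proxy is unsure about), which matches the abstract's claim that the proxy refines both data curation and the policy optimization objective.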