Prediction-Powered Ranking of Large Language Models
CoRR (2024)
Abstract
Large language models are often ranked according to their level of alignment
with human preferences – a model is better than other models if its outputs
are more frequently preferred by humans. One of the most popular ways to elicit
human preferences utilizes pairwise comparisons between the outputs provided by
different models to the same inputs. However, since gathering pairwise
comparisons by humans is costly and time-consuming, it has become a very common
practice to gather pairwise comparisons by a strong large language model – a
model strongly aligned with human preferences. Surprisingly, practitioners
cannot currently measure the uncertainty that any mismatch between human and
model preferences may introduce in the constructed rankings. In this work, we
develop a statistical framework to bridge this gap. Given a small set of
pairwise comparisons by humans and a large set of pairwise comparisons by a
model, our framework provides a rank-set – a set of possible ranking positions
– for each of the models under comparison. Moreover, it guarantees that, with
a probability greater than or equal to a user-specified value, the rank-sets
cover the true ranking consistent with (the distribution of) human pairwise
preferences. Our framework is computationally efficient, easy to use, and
makes no assumptions about the distribution of human preferences or about
the degree of alignment between the pairwise comparisons by humans and those
by the strong large language model.
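
To make the idea of a rank-set concrete, the following is a minimal Python sketch and not the construction used in the paper. It assumes that each model is summarized by a single win rate, that this win rate is estimated with the standard prediction-powered mean estimator (the average model-judged outcome on the large set, corrected by the average human-minus-model difference on the small set), and that rank-sets are read off from overlapping per-model intervals. The function names `ppi_win_rate` and `rank_sets`, the normal-approximation interval, and the example intervals are all illustrative assumptions. The sketch applies no correction for comparing several models at once, so it does not by itself deliver the coverage guarantee stated above.

```python
from math import sqrt
from statistics import mean, variance


def ppi_win_rate(human_small, model_small, model_large, z=1.96):
    """Prediction-powered interval for a win rate (illustrative assumption).

    human_small / model_small: 0/1 outcomes on the same small set of comparisons,
    judged by humans and by the strong model respectively.
    model_large: 0/1 outcomes judged by the strong model on a large set.
    """
    # Rectifier: how far the model's judgments deviate from human judgments.
    rectifier = [h - m for h, m in zip(human_small, model_small)]
    estimate = mean(model_large) + mean(rectifier)
    # Normal-approximation variance: one term per data source.
    var = variance(model_large) / len(model_large) + variance(rectifier) / len(rectifier)
    half_width = z * sqrt(var)
    return estimate - half_width, estimate + half_width


def rank_sets(intervals):
    """Map each model to (best possible rank, worst possible rank).

    intervals: dict of model name -> (lower, upper) bound on its win rate.
    A model can only be ranked below models whose interval lies entirely
    above its own, and above models whose interval lies entirely below.
    """
    result = {}
    for name, (lo, hi) in intervals.items():
        surely_better = sum(
            1 for other, (olo, _) in intervals.items() if other != name and olo > hi
        )
        surely_worse = sum(
            1 for other, (_, ohi) in intervals.items() if other != name and ohi < lo
        )
        result[name] = (1 + surely_better, len(intervals) - surely_worse)
    return result


# Hypothetical intervals for three models, made up for illustration.
example = {"A": (0.6, 0.8), "B": (0.5, 0.65), "C": (0.1, 0.3)}
print(rank_sets(example))
```

In this hypothetical example the intervals of A and B overlap, so each of them receives the rank-set {1, 2}, while C is separated from both and receives the single rank 3.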