A Data Generation Perspective to the Mechanism of In-Context Learning
CoRR (2024)
Abstract
In-Context Learning (ICL) empowers Large Language Models (LLMs) with the
capacity to learn in context, achieving downstream generalization without
gradient updates, using only a few in-context examples. Despite this
encouraging empirical success, the underlying mechanism of ICL remains
unclear, and existing research offers various viewpoints for understanding
it. These studies propose intuition-driven and ad-hoc technical solutions
for interpreting ICL, yielding an ambiguous road map. In this paper, we
leverage a data generation perspective to reinterpret recent efforts and
demonstrate the potentially broader use of popular technical solutions,
approaching the problem from a systematic angle. For a conceptual
definition, we rigorously adopt the terms skill learning and skill
recognition; the key difference is that skill learning can acquire new data
generation functions from in-context data. We also provide a comprehensive
study of the merits and weaknesses of different solutions, and highlight
the uniformity among them from the data generation perspective,
establishing a technical foundation for future research to incorporate the
strengths of the different lines of work.