Reinforced In-Context Black-Box Optimization
CoRR (2024)
Abstract
Black-Box Optimization (BBO) has found successful applications in many fields
of science and engineering. Recently, there has been a growing interest in
meta-learning particular components of BBO algorithms to speed up optimization
and get rid of tedious hand-crafted heuristics. As an extension, learning the
entire algorithm from data requires the least labor from experts and can
provide the most flexibility. In this paper, we propose RIBBO, a method to
reinforce-learn a BBO algorithm from offline data in an end-to-end fashion.
RIBBO employs expressive sequence models to learn the optimization histories
produced by multiple behavior algorithms and tasks, leveraging the in-context
learning ability of large models to extract task information and make decisions
accordingly. Central to our method is to augment the optimization histories
with regret-to-go tokens, which are designed to represent the performance of an
algorithm based on cumulative regret of the histories. The integration of
regret-to-go tokens enables RIBBO to automatically generate sequences of query
points that satisfy the user-desired regret, which is verified by its
universally good empirical performance on diverse problems, including BBOB
functions, hyper-parameter optimization and robot control problems.
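The regret-to-go idea described above can be sketched in a few lines: each step of an optimization history is tagged with the cumulative regret of the remaining steps, so that conditioning the sequence model on a small regret-to-go token steers it toward generating high-performing query sequences. The following is a minimal illustrative sketch, assuming a simple instantaneous-regret definition (f(x_t) − f* for minimization); the function name and signature are hypothetical, not the paper's implementation:

```python
def regret_to_go(values, f_star):
    """Compute regret-to-go tokens for an optimization history.

    values: observed objective values f(x_1), ..., f(x_T) (minimization).
    f_star: the optimal objective value of the task.

    Returns a list where entry t is the sum of instantaneous regrets
    f(x_i) - f_star for i = t, ..., T (the regret "still to go").
    """
    regrets = [v - f_star for v in values]
    remaining = sum(regrets)
    tokens = []
    for r in regrets:
        tokens.append(remaining)  # regret accumulated from this step onward
        remaining -= r
    return tokens


# A trajectory that converges to the optimum has regret-to-go
# shrinking toward zero:
history = [3.0, 1.0, 0.0]
print(regret_to_go(history, f_star=0.0))  # → [4.0, 1.0, 0.0]
```

At inference time, prepending a user-chosen (e.g. near-zero) regret-to-go token plays the role of asking the model for a low-regret continuation, analogous to return-conditioning in decision-transformer-style sequence models.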