Enforcing Predictive Invariance across Structured Biomedical Domains

Wengong Jin, Regina Barzilay, Tommi S. Jaakkola

(2021)


Abstract

Many biochemical applications, such as molecular property prediction, require models to generalize beyond their training domains (environments). Moreover, natural environments in these tasks are structured, defined by complex descriptors such as molecular scaffolds or protein families. As a result, most environments are either never seen during training or contain only a single training example. To address these challenges, we propose a new regret minimization (RGM) algorithm and its extension for structured environments. RGM builds on invariant risk minimization (IRM) by recasting the simultaneous optimality condition in terms of predictive regret, finding a representation that enables the predictor to compete against an oracle with hindsight access to held-out environments. The structured extension adaptively highlights variation due to complex environments via specialized domain perturbations. We evaluate our method on multiple applications (molecular property prediction, protein homology prediction, and protein stability prediction) and show that RGM significantly outperforms previous state-of-the-art baselines.
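The notion of predictive regret described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a fixed representation (the features are taken as given, whereas RGM learns the representation), uses a least-squares linear predictor as a stand-in for the model, and the environment names (`scaffold_a`, etc.) and helper functions are hypothetical. It only shows the comparison the abstract names: a predictor trained without a held-out environment versus an oracle fitted with hindsight access to it.

```python
import numpy as np


def fit_linear(features, targets):
    """Least-squares linear predictor on top of a fixed representation."""
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights


def mse(weights, features, targets):
    """Mean squared error of a linear predictor."""
    return float(np.mean((features @ weights - targets) ** 2))


def predictive_regret(envs, held_out):
    """Regret on one held-out environment (illustrative definition).

    envs maps an environment name to (features, targets); the features are
    assumed to already be the output of some representation.
    """
    x_e, y_e = envs[held_out]
    rest = [pair for name, pair in envs.items() if name != held_out]
    x_rest = np.vstack([x for x, _ in rest])
    y_rest = np.concatenate([y for _, y in rest])
    # Predictor trained without seeing the held-out environment.
    w_rest = fit_linear(x_rest, y_rest)
    # Oracle predictor with hindsight access to the held-out environment.
    w_oracle = fit_linear(x_e, y_e)
    return mse(w_rest, x_e, y_e) - mse(w_oracle, x_e, y_e)


# Synthetic data (hypothetical): three environments sharing one relation.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
envs = {}
for name in ("scaffold_a", "scaffold_b", "scaffold_c"):
    x = rng.normal(size=(20, 2))
    envs[name] = (x, x @ true_w)

# Worst-case regret when the feature-label relation is shared everywhere.
invariant_regret = max(predictive_regret(envs, name) for name in envs)

# Same setup, but one environment follows a different relation.
envs_shifted = dict(envs)
x_shift = rng.normal(size=(20, 2))
envs_shifted["scaffold_c"] = (x_shift, x_shift @ np.array([-1.0, 2.0]))
shifted_regret = predictive_regret(envs_shifted, "scaffold_c")
```

In this toy setting the regret is close to zero when every environment shares the same feature-to-label relation, and it becomes clearly positive once the held-out environment deviates, which is the kind of signal a regret-based objective could penalize when choosing a representation.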


Keywords

Regret, Oracle, Invariant (physics), Invariant (mathematics), Theoretical computer science, Minification, Computer science, Hindsight bias, Protein homology, Regret minimization
