Towards Causal Foundation Model: on Duality between Causal Inference and Attention
arXiv (2023)
Abstract
Foundation models have brought changes to the landscape of machine learning,
demonstrating sparks of human-level intelligence across a diverse array of
tasks. However, a gap persists in complex tasks such as causal inference,
primarily due to challenges associated with intricate reasoning steps and high
numerical precision requirements. In this work, we take a first step towards
building causally-aware foundation models for complex tasks. We propose a
novel, theoretically sound method called Causal Inference with Attention
(CInA), which utilizes multiple unlabeled datasets to perform self-supervised
causal learning, and subsequently enables zero-shot causal inference on unseen
tasks with new data. This is based on our theoretical results that demonstrate
the primal-dual connection between optimal covariate balancing and
self-attention, facilitating zero-shot causal inference through the final layer
of a trained transformer-type architecture. We demonstrate empirically that
CInA generalizes effectively to out-of-distribution data and a variety of
real-world datasets, matching or even surpassing traditional per-dataset
causal inference methodologies.
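The core claim — that attention weights can act as covariate-balancing weights — can be illustrated at a cartoon level. The sketch below is an assumption-laden toy, not the paper's CInA method: it builds attention-style softmax weights from pairwise covariate similarities (all variable names, the similarity kernel, and the aggregation are illustrative choices) and uses them in a weighted difference-in-means estimate of a treatment effect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy randomized dataset (shapes and generating process are illustrative
# assumptions): covariates X, binary treatment T, outcome Y with true
# treatment effect 2.0.
n, d = 200, 5
X = rng.normal(size=(n, d))
T = (rng.random(n) < 0.5).astype(float)
Y = X @ rng.normal(size=d) + 2.0 * T + rng.normal(scale=0.1, size=n)

def attention_balancing_weights(X, tau=1.0):
    """Attention-style sample weights from softmax-normalized similarities.

    This mimics the 'self-attention as covariate balancing' idea only at a
    conceptual level: each sample's weight is the average softmax attention
    mass it receives from all samples.
    """
    scores = X @ X.T / (tau * np.sqrt(X.shape[1]))
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    A = np.exp(scores)
    A /= A.sum(axis=1, keepdims=True)            # each row sums to 1
    w = A.mean(axis=0)                           # aggregate attention mass
    return w / w.sum()

w = attention_balancing_weights(X)

# Weighted difference in means as a naive average-treatment-effect estimate.
treated, control = T == 1, T == 0
ate = (w[treated] @ Y[treated]) / w[treated].sum() \
    - (w[control] @ Y[control]) / w[control].sum()
```

In the paper, the analogous weights come from the final layer of a transformer trained across many datasets, which is what enables zero-shot inference on new data; here the "attention" is a single fixed kernel over one dataset, shown only to make the weighting mechanics concrete.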