Medical Vision Language Pretraining: A survey
CoRR (2023)
Abstract
Medical Vision Language Pretraining (VLP) has recently emerged as a promising
solution to the scarcity of labeled data in the medical domain. By leveraging
paired/unpaired vision and text datasets through self-supervised learning,
models can be trained to acquire vast knowledge and learn robust feature
representations. Such pretrained models have the potential to enhance multiple
downstream medical tasks simultaneously, reducing the dependency on labeled
data. However, despite recent progress and its potential, no comprehensive
survey has yet explored the various aspects of and advances in medical VLP.
In this paper, we specifically review existing
works through the lens of different pretraining objectives, architectures,
downstream evaluation tasks, and datasets utilized for pretraining and
downstream tasks. Subsequently, we delve into current challenges in medical
VLP, discussing existing and potential solutions, and conclude by highlighting
future directions. To the best of our knowledge, this is the first survey
focused on medical VLP.