Retrieving Multimodal Information for Augmented Generation: A Survey.

Ruochen Zhao,Hailin Chen,Weishi Wang,Fangkai Jiao,Do Xuan Long,Chengwei Qin,Bosheng Ding,Xiaobao Guo,Minzhi Li,Xingxuan Li,Shafiq Joty

arxiv（2023）

引用 5|浏览125

暂无评分

摘要

In this survey, we review methods that retrieve multimodal knowledge to assist and augment generative models. This group of works focuses on retrieving grounding contexts from external sources, including images, codes, tables, graphs, and audio. As multimodal learning and generative AI have become more and more impactful, such retrieval augmentation offers a promising solution to important concerns such as factuality, reasoning, interpretability, and robustness. We provide an in-depth review of retrieval-augmented generation in different modalities and discuss potential future directions. As this is an emerging field, we continue to add new papers and methods.

查看译文

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要