BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks

Kai Zhang,Jun Yu,Eashan Adhikarla,Rong Zhou, Zhiling Yan, Yixin Liu, Zhengliang Liu,Lifang He,Brian Davison,Xiang Li, Hui Ren, Sunyang Fu, James Zou, Wei Liu,Jing Huang,Chen Chen,Yuyin Zhou, Tianming Liu, Xun Chen, Yong Chen, Quanzheng Li, Hongfang Liu,Lichao Sun

CoRR（2023）

引用 47|浏览172

暂无评分

摘要

Conventional task- and modality-specific artificial intelligence (AI) models are inflexible in real-world deployment and maintenance for biomedicine. At the same time, the growing availability of biomedical data, coupled with the advancements in modern multi-modal multi-task AI techniques, has paved the way for the emergence of generalist biomedical AI solutions. These solutions hold the potential to interpret different medical modalities and produce expressive outputs such as free-text reports or disease diagnosis. Here, we propose BiomedGPT, the first open-source and generalist visual language AI for diverse biomedical tasks. BiomedGPT achieved 16 state-of-the-art results across five clinically significant tasks on 26 datasets. Notably, it outperformed OpenAI's GPT-4 with vision (GPT-4V) in radiology human evaluation and surpassed Google's Med-PaLM M (12B) in breast cancer diagnosis and medical visual question answering. Moreover, BiomedGPT facilitates zero-shot transfer learning, greatly enhancing its utility as a biomedical assistant, similar to ChatGPT. Our method demonstrates effective training with diverse datasets can lead to more practical biomedical AI.

查看译文

关键词

multimodal tasks,transformer,vision,unified,pre-trained

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要