MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant
CVPR 2024
Abstract
Medical generative models, recognized for their ability to generate
high-quality samples, have accelerated the growth of medical applications.
However, recent works concentrate on separate generative models for
distinct medical tasks and draw on limited multi-modal medical
knowledge, which constrains comprehensive medical diagnosis. In this paper, we
propose MedM2G, a Medical Multi-Modal Generative framework whose key
innovation is to align, extract, and generate medical multi-modal data within
a unified model. Extending beyond one or two medical modalities, we
efficiently align multiple medical modalities through a central alignment
approach in the unified space. Notably, our framework extracts valuable clinical knowledge by
preserving the medical visual invariant of each imaging modality, thereby
enhancing specific medical information for multi-modal generation. By
conditioning the multi-flow diffusion framework on adaptive cross-guided
parameters, our model promotes flexible interactions among medical modalities
during generation. MedM2G is the first medical generative model that unifies
medical generation tasks of text-to-image, image-to-text, and unified
generation of medical modalities (CT, MRI, X-ray). It performs 5 medical
generation tasks across 10 datasets, consistently outperforming various
state-of-the-art works.