NeuroCine: Decoding Vivid Video Sequences from Human Brain Activities
CoRR (2024)
Abstract
In the pursuit to understand the intricacies of the human brain's visual
processing, reconstructing dynamic visual experiences from brain activities
emerges as a challenging yet fascinating endeavor. While recent advancements
have achieved success in reconstructing static images from non-invasive brain
recordings, the domain of translating continuous brain activities into video
format remains underexplored. In this work, we introduce NeuroCine, a novel
dual-phase framework targeting the inherent challenges of decoding fMRI
data, such as noise, spatial redundancy, and temporal lags. This framework
proposes spatial masking and temporal interpolation-based augmentation for
contrastive learning of fMRI representations, and a diffusion model enhanced by
dependent prior noise for video generation. Tested on a publicly available fMRI
dataset, our method shows promising results, outperforming the previous
state-of-the-art models by notable margins of 20.97%, 31.00%, and
12.30% respectively on decoding the brain activities of three subjects in
the fMRI dataset, as measured by SSIM. Additionally, our attention analysis
suggests that the model aligns with existing brain structures and functions,
indicating its biological plausibility and interpretability.