A Comparative Analysis of Visual Encoding Models Based on Classification and Segmentation Task-Driven CNNs.

Computational and Mathematical Methods in Medicine (2020)

Abstract
Visual encoding models now use convolutional neural networks (CNNs), which achieve outstanding performance in computer vision, to simulate human visual information processing. However, the prediction performance of an encoding model depends on the task the underlying network was trained to perform. Here, the impact of the network task on encoding models is studied. Using functional magnetic resonance imaging (fMRI) data, features of natural visual stimuli are extracted with a segmentation network (FCN32s) and a classification network (VGG16), which perform different visual tasks but share a similar architecture. Then, for three feature sets, i.e., segmentation, classification, and fused features, the regularized orthogonal matching pursuit (ROMP) method is used to establish a linear mapping from features to voxel responses. The results indicate that encoding models based on networks performing different tasks can effectively, but differently, predict stimulus-evoked responses measured by fMRI. The prediction accuracy of the encoding model based on VGG16 is significantly better than that of the model based on FCN32s in most voxels, and similar to that of the model based on fused features. This comparative analysis suggests that a CNN performing a classification task is more similar to human visual processing than one performing a segmentation task.
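As a rough illustration of the encoding pipeline the abstract describes, the sketch below extracts features from a pretrained VGG16 and fits a sparse linear mapping to a single voxel's response. It is a minimal sketch under stated assumptions, not the authors' implementation: scikit-learn ships plain orthogonal matching pursuit (OrthogonalMatchingPursuit), used here as a stand-in for the regularized variant (ROMP) the paper uses, and the pooling step, layer choice, data shapes, and sparsity level are all illustrative.

```python
# Minimal sketch of the encoding pipeline: CNN features -> sparse linear
# mapping to a voxel response. Names, shapes, and the pooling step are
# illustrative assumptions, not the paper's exact configuration.
import numpy as np
import torch
from torchvision.models import vgg16
from sklearn.linear_model import OrthogonalMatchingPursuit  # plain OMP; stands in for ROMP

# Pretrained classification network; for segmentation features the paper
# uses FCN32s instead of VGG16.
backbone = vgg16(weights="IMAGENET1K_V1").features.eval()

def extract_features(images: torch.Tensor) -> np.ndarray:
    """Return one feature row per stimulus from the VGG16 conv stack.
    Global average pooling is applied only to keep the sketch small."""
    with torch.no_grad():
        fmaps = backbone(images)            # (N, 512, 7, 7) for 224x224 inputs
    return fmaps.mean(dim=(2, 3)).numpy()   # (N, 512)

# Toy data standing in for natural-image stimuli and one voxel's fMRI response.
stimuli = torch.randn(100, 3, 224, 224)
voxel_response = np.random.randn(100)

X = extract_features(stimuli)

# Sparse linear mapping from features to the voxel response; in the paper a
# regularized OMP plays this role, with one such model fitted per voxel.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=50)
omp.fit(X, voxel_response)
predicted = omp.predict(X)
print("in-sample correlation:", np.corrcoef(predicted, voxel_response)[0, 1])
```

In practice one such sparse regression would be fitted per voxel and accuracy assessed on held-out stimuli; the fused-feature condition would presumably combine the VGG16 and FCN32s feature sets (e.g., by concatenation) before the regression step, though the abstract does not specify the fusion mechanism.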