Can deep neural networks for intrinsic image decomposition model human lightness constancy?

Alban Flachot, Jaykishan Patel, Karishma Patel, T. E. Wallis, Marcus A. Brubaker, David H. Brainard, Robert K. Murray

Journal of Vision (2023)

Abstract

A challenge in vision science is understanding how the visual system parses the retinal image to represent intrinsic properties of scenes, such as surface reflectance and lighting. Deep learning networks have provided successful new approaches to inferring intrinsic images, and here we investigate these networks as models of human lightness constancy. We examined two state-of-the-art architectures for intrinsic image decomposition (Yu & Smith, 2019; Li et al., 2020), trained on photorealistic images of synthetic scenes. To compare network and human performance, we measured the networks’ estimates of surface reflectance using Mondrian patterns embedded in an indoor scene. A reference patch was shown under a fixed illuminant, and multiple test patches were shown under five different illumination levels. At each illumination level, we rendered 17 reflectance levels of the test patch and interpolated the networks’ estimates for each to find a reflectance match to the reference, thus probing the networks’ lightness constancy. We repeated this procedure for three different reference reflectances. We also tested human observers in a corresponding lightness matching task, using the same stimuli presented with a virtual reality display. Human observers showed good lightness constancy, with an average constancy index (CI) of 0.81 across all stimuli. They were also consistent across conditions, with CI standard deviations around 0.10 across reflectance and lighting conditions. The deep learning networks, however, showed poor reflectance constancy, with an average CI of 0.19. Qualitative analysis suggests that the networks often misinterpreted lighting changes as reflectance changes. The networks were also less consistent than humans, with CI standard deviations of 0.34 and 0.21 across reflectance and lighting conditions, respectively. These results show that these deep learning networks do not fully model human lightness constancy. We will discuss potential strategies to address this shortcoming, as well as proposals for further benchmarking such networks.
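
The matching procedure in the abstract (interpolating estimates over the rendered test reflectances to find the one that matches the reference) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not define its constancy index, so the sketch uses a log-ratio index in the style of the Thouless ratio, and `toy_estimate` is a hypothetical stand-in for a network's reflectance output, not either of the tested architectures.

```python
import numpy as np


def match_reflectance(test_reflectances, test_estimates, reference_estimate):
    """Test reflectance whose estimate equals the reference estimate.

    Uses linear interpolation; values outside the sampled range are
    clamped to the end points by np.interp.
    """
    r = np.asarray(test_reflectances, dtype=float)
    e = np.asarray(test_estimates, dtype=float)
    order = np.argsort(e)  # np.interp needs increasing x-coordinates
    return float(np.interp(reference_estimate, e[order], r[order]))


def constancy_index(match, reference_reflectance, illum_ref, illum_test):
    """Assumed log-ratio constancy index (not taken from the abstract).

    1 means the match equals the reference reflectance (full constancy);
    0 means the match equates luminance instead (no constancy).
    """
    if illum_ref == illum_test:
        raise ValueError("index is undefined when both illuminants are equal")
    shift = np.log(match / reference_reflectance)
    return float(1.0 - shift / np.log(illum_ref / illum_test))


def toy_estimate(reflectance, illum, c):
    """Hypothetical estimator that discounts a fraction c of the illuminant."""
    return reflectance * illum ** (1.0 - c)


# 17 test reflectance levels, as in the abstract; the range is an assumption.
levels = np.linspace(0.1, 0.9, 17)
ref_reflectance, illum_ref, illum_test = 0.4, 1.0, 0.5

ref_est = toy_estimate(ref_reflectance, illum_ref, c=0.8)
test_est = toy_estimate(levels, illum_test, c=0.8)

match = match_reflectance(levels, test_est, ref_est)
ci = constancy_index(match, ref_reflectance, illum_ref, illum_test)
print(round(ci, 3))
```

In this toy setup the index recovers the estimator's own discounting parameter `c`, which is a convenient sanity check; with real network outputs the estimates need not be monotonic in reflectance, and the interpolation step would need more care.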

Keywords

deep neural networks, neural networks, decomposition
