Visual Sensemaking Needs Both Vision and Semantics: On Logic-Based Declarative Neurosymbolism for Reasoning about Space and Motion

Jakob Suchan, Mehul Bhatt, Srikrishna Varadarajan

ELECTRONIC PROCEEDINGS IN THEORETICAL COMPUTER SCIENCE (2023)

Abstract
Contemporary artificial vision systems lack abilities for high-level, human-scale mental simulation, e.g., concerning perceived everyday multimodal interactions. Cognitively driven sensemaking functions such as embodied grounding for active vision, visuospatial concept formation, commonsense explanation, and diagnostic introspection all remain fertile ground. We posit that developing high-level visual sensemaking capabilities requires a systematic, tight yet modular integration of (neural) visual processing techniques with high-level commonsense knowledge representation and reasoning methods pertaining to space, motion, actions, events, conceptual knowledge, etc. As an exemplar of this thinking, we position recent work on deeply semantic, explainable, neurosymbolic visuospatial reasoning driven by an integration of methods in (deep learning based) vision and (KR based) semantics. The positioned work is general, but its significance is demonstrated and empirically benchmarked in the context of (active, real-time) visual sensemaking for self-driving vehicles.
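To make the neurosymbolic pattern described above concrete, the following is a minimal illustrative sketch (not the authors' system): per-frame output of a hypothetical neural detector/tracker is abstracted into symbolic facts, and a hand-written declarative-style rule over space and motion infers an occlusion event in a driving scene. The positioned work realises this step with logic programming; here it is approximated in Python, and all identifiers (Detection, overlaps, occluded_by, the toy data) are assumptions introduced for illustration.

# Illustrative sketch only: toy neurosymbolic loop in which (hypothetical)
# neural detections are abstracted into qualitative spatial facts and a
# hand-written rule infers an "occluded_by" event across two frames.
from dataclasses import dataclass

@dataclass
class Detection:
    obj_id: str        # track identifier from a (hypothetical) neural tracker
    category: str      # e.g. "car", "pedestrian"
    box: tuple         # (x1, y1, x2, y2) in image coordinates
    frame: int

def overlaps(a: Detection, b: Detection) -> bool:
    """Qualitative spatial relation: axis-aligned bounding boxes intersect."""
    ax1, ay1, ax2, ay2 = a.box
    bx1, by1, bx2, by2 = b.box
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2

def occluded_by(scene: list, prev_scene: list) -> list:
    """Rule: an object visible in the previous frame, missing now, whose last
    box overlaps another object's current box, is inferred to be occluded."""
    visible_now = {d.obj_id for d in scene}
    events = []
    for prev in prev_scene:
        if prev.obj_id in visible_now:
            continue
        for other in scene:
            if other.obj_id != prev.obj_id and overlaps(prev, other):
                events.append((prev.obj_id, "occluded_by", other.obj_id, other.frame))
    return events

# Toy data standing in for per-frame neural detections.
frame0 = [Detection("ped1", "pedestrian", (100, 50, 130, 120), 0),
          Detection("car7", "car", (90, 40, 200, 140), 0)]
frame1 = [Detection("car7", "car", (85, 40, 195, 140), 1)]  # pedestrian no longer detected

print(occluded_by(frame1, frame0))
# [('ped1', 'occluded_by', 'car7', 1)]

In a declarative (e.g., answer set programming) formulation, the overlaps/occluded_by rules would be stated as logical clauses over detection facts rather than imperative functions; the sketch only conveys the division of labour between neural perception and symbolic, explainable inference about space and motion.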