ADOSMNet: a novel visual affordance detection network with object shape mask guided feature encoders

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Abstract
Visual affordance detection aims to understand the functional attributes of objects, which is crucial for robots performing interactive tasks. Most existing affordance detection methods rely mainly on global image features and do not fully exploit the features of locally relevant objects in the image, which often leads to suboptimal detection accuracy under interference from cluttered backgrounds and neighbouring objects. Numerous studies have shown that the accuracy of affordance detection largely depends on the quality of the extracted image features. In this paper, we propose a novel affordance detection network with object shape mask guided feature encoders. The masks act as an attention mechanism that forces the network to focus on the shape regions of target objects in the image, which helps it obtain high-quality features. Specifically, we first propose a shape mask guided encoder, which uses masks to locate all target objects and thereby extract more expressive features. Based on this encoder, we then propose a dual enhance feature aggregation module, which consists of two branches. The first branch encodes the global features of the original image, while the second branch locates each locally relevant object and encodes its precise features. Aggregating these features enhances the feature representation of each object, further improving feature quality and suppressing interference. Quantitative and qualitative comparisons with state-of-the-art methods demonstrate that the proposed method achieves superior performance on two commonly used affordance detection datasets.
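The two-branch idea in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the function names (`apply_mask`, `aggregate`, `dual_branch`), the single-channel grid representation, the binary masks, and the use of simple summation to fuse the branches are all assumptions made for illustration, since the abstract does not specify these details.

```python
from typing import List

Grid = List[List[float]]  # a toy H x W single-channel feature map


def apply_mask(features: Grid, mask: Grid) -> Grid:
    """Gate features element-wise so only values inside the object mask survive."""
    return [
        [f * m for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(features, mask)
    ]


def aggregate(global_feats: Grid, local_feats: List[Grid]) -> Grid:
    """Fuse branches by summation (an assumed fusion rule, not from the paper)."""
    out = [row[:] for row in global_feats]  # copy so the input is not mutated
    for local in local_feats:
        for i, row in enumerate(local):
            for j, value in enumerate(row):
                out[i][j] += value
    return out


def dual_branch(features: Grid, object_masks: List[Grid]) -> Grid:
    """Hypothetical two-branch aggregation over one feature map."""
    global_feats = features  # branch 1: features of the whole image
    # branch 2: one masked (local) feature map per target object
    local_feats = [apply_mask(features, m) for m in object_masks]
    return aggregate(global_feats, local_feats)


features = [[1.0, 2.0], [3.0, 4.0]]
masks = [
    [[1.0, 0.0], [0.0, 0.0]],  # object A occupies the top-left cell
    [[0.0, 0.0], [0.0, 1.0]],  # object B occupies the bottom-right cell
]
enhanced = dual_branch(features, masks)
```

In this toy setting, cells covered by an object mask are reinforced relative to the background, which mirrors the stated goal of enhancing each object's representation while suppressing interference. A real network would apply learned encoders to multi-channel feature maps rather than summing raw values.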
Keywords
Visual affordance detection, Object shape mask, Feature representation, Feature enhancement, Image segmentation