VRP-SAM: SAM with Visual Reference Prompt

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Abstract
In this paper, we propose a novel Visual Reference Prompt (VRP) encoder that empowers the Segment Anything Model (SAM) to utilize annotated reference images as prompts for segmentation, creating the VRP-SAM model. In essence, VRP-SAM can utilize annotated reference images to comprehend specific objects and perform segmentation of those objects in the target image. Note that the VRP encoder supports a variety of annotation formats for reference images, including point, box, scribble, and mask. VRP-SAM achieves a breakthrough within the SAM framework by extending its versatility and applicability while preserving SAM's inherent strengths, thus enhancing user-friendliness. To enhance the generalization ability of VRP-SAM, the VRP encoder adopts a meta-learning strategy. To validate the effectiveness of VRP-SAM, we conducted extensive empirical studies on the Pascal and COCO datasets. Remarkably, VRP-SAM achieved state-of-the-art performance in visual reference segmentation with minimal learnable parameters. Furthermore, VRP-SAM demonstrates strong generalization capabilities, allowing it to perform segmentation of unseen objects and enabling cross-domain segmentation.
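The abstract describes the VRP encoder only at a high level. As a purely illustrative sketch (not the authors' released code), the following PyTorch module shows one way such an encoder could turn an annotated reference image into prompt embeddings for a promptable segmenter like SAM: reference features are restricted to the annotated region, and a set of learnable query tokens cross-attends first to those reference features and then to the target-image features. All module names, shapes, and hyperparameters here are assumptions for illustration only.

```python
# Hypothetical sketch of a visual-reference-prompt encoder (not the paper's code).
import torch
import torch.nn as nn


class VRPEncoderSketch(nn.Module):
    def __init__(self, feat_dim: int = 256, num_queries: int = 50):
        super().__init__()
        # Learnable query tokens that will become the prompt embeddings.
        self.queries = nn.Parameter(torch.randn(num_queries, feat_dim) * 0.02)
        # Cross-attention: queries attend to annotation-conditioned image features.
        self.ref_attn = nn.MultiheadAttention(feat_dim, num_heads=8, batch_first=True)
        self.tgt_attn = nn.MultiheadAttention(feat_dim, num_heads=8, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)

    def forward(self, ref_feats, ref_mask, tgt_feats):
        """
        ref_feats: (B, C, H, W) features of the annotated reference image
        ref_mask:  (B, 1, H, W) rasterized annotation (mask/box/scribble/points)
        tgt_feats: (B, C, H, W) features of the target image
        returns:   (B, num_queries, C) prompt embeddings for a mask decoder
        """
        B = ref_feats.shape[0]
        # Keep only reference features inside the annotated region.
        ref_tokens = (ref_feats * ref_mask).flatten(2).transpose(1, 2)  # (B, HW, C)
        tgt_tokens = tgt_feats.flatten(2).transpose(1, 2)               # (B, HW, C)

        q = self.queries.unsqueeze(0).expand(B, -1, -1)                 # (B, Q, C)
        # Absorb the reference object's appearance into the queries...
        q, _ = self.ref_attn(q, ref_tokens, ref_tokens)
        # ...then ground the queries in the target image.
        q, _ = self.tgt_attn(q, tgt_tokens, tgt_tokens)
        return self.norm(q)


if __name__ == "__main__":
    enc = VRPEncoderSketch()
    ref = torch.randn(2, 256, 64, 64)
    mask = (torch.rand(2, 1, 64, 64) > 0.5).float()
    tgt = torch.randn(2, 256, 64, 64)
    print(enc(ref, mask, tgt).shape)  # torch.Size([2, 50, 256])
```

In the actual model, the resulting prompt embeddings would be fed to SAM's mask decoder and trained with segmentation losses such as binary cross-entropy and Dice loss (both listed among the keywords below); that wiring is omitted from this sketch.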
Keywords
Visual Reference, Visual Prompts, Image Object, Target Image, Reference Image, Learnable Parameters, Generalization Capability, Segmentation Performance, Object Segmentation, Scribble, COCO Dataset, Annotation Format, Semantic, Training Set, Image Features, Bounding Box, Target Object, Segmentation Results, Segmentation Task, Random Initialization, Image Encoder, Dice Loss, Foundation Model, Binary Cross Entropy, Binary Cross-entropy Loss, Vision Transformer, Style Image, Self-attention Layer, Base Classes