Enhancing Text-Image Person Retrieval Through Nuances Varied Sample

Jiaer Xia, Haozhe Yang, Yan Zhang, Pingyang Dai

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I (2024)

Abstract
Text-image person retrieval is a task that involves searching for a specific individual based on a corresponding textual description. A key challenge in this task is achieving modal alignment while conducting fine-grained retrieval. Current methods utilize classification and metric losses to enhance discrimination and alignment. Nevertheless, the substantial dissimilarities between samples often impede the network's capacity to learn discriminative fine-grained information. To tackle this issue and enable the network to focus on intricate details, we introduce the Nuanced Variation Module (NVM). This module generates artificially difficult negative samples, which guide the network's attention towards discerning nuances. Incorporating NVM-constructed hard-negative samples enhances the alignment loss and facilitates the network's attentiveness to details. Additionally, we leverage the image-text matching task to explicitly augment the network's fine-grained ability. By adopting our NVM method, the network can extract ample fine-grained features, thereby mitigating the interference caused by challenging negative samples. Extensive experiments demonstrate that our proposed method achieves competitive performance compared to state-of-the-art approaches on publicly available datasets.
Keywords
Text-image person retrieval, Text-based person re-identification