Interclass-Relativity-Adaptive Metric Learning For Cross-Modal Matching And Beyond
IEEE TRANSACTIONS ON MULTIMEDIA (2021)
Abstract
Training under the supervision of a triplet ranking loss is the dominant methodology for cross-modal matching models, yet well-performing losses in this domain remain largely under-explored, because most advanced metric losses are inapplicable to the particular constraints of the cross-modal setting. Prominent metric learning approaches have developed various weighting schemes that assign weights to individual positive or negative samples. What matters, however, is the interclass relative order within a triplet. In this work, we propose a new Interclass-Relativity-Adaptive (IRA) loss that assigns weights to the relative similarities between positive and negative pairs rather than to separate pairs. This allows us to treat a whole triplet as a weighable entity and to make maximum use of the sole positive available in the cross-modal setting. Our method outperforms the baselines by a large margin and obtains competitive results on two video-text matching benchmarks and two image-text matching benchmarks. We further extend our method to two unimodal image retrieval benchmarks to test its generality and achieve new state-of-the-art results.
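The contrast the abstract draws, between a plain triplet ranking loss and a loss that weights each triplet by its positive-versus-negative relative similarity, can be illustrated with a minimal sketch. The abstract does not give the IRA formula, so the softmax weighting below is only an assumed stand-in, not the paper's method; the function names and the parameters `margin` and `beta` are hypothetical.

```python
import math


def triplet_ranking_loss(sim, margin=0.2):
    """Plain bidirectional triplet ranking loss (sum of hinges).

    sim[i][j] is the similarity between query i (e.g. a video) and item j
    (e.g. a caption). The diagonal sim[i][i] is the sole positive pair.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Query i against negative item j.
            total += max(0.0, margin + sim[i][j] - sim[i][i])
            # Item i against negative query j.
            total += max(0.0, margin + sim[j][i] - sim[i][i])
    return total


def relativity_weighted_loss(sim, margin=0.2, beta=10.0):
    """Hinge loss where each whole triplet is weighted by its relative similarity.

    ASSUMPTION: the weight is a softmax over the gaps (negative similarity
    minus positive similarity) for one anchor. This is an illustrative
    choice, not the weighting defined in the paper.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        for row_wise in (True, False):
            # Relative similarity of every negative with respect to the positive.
            gaps = [
                (sim[i][j] if row_wise else sim[j][i]) - sim[i][i]
                for j in range(n)
                if j != i
            ]
            if not gaps:
                continue
            hinges = [max(0.0, margin + g) for g in gaps]
            # Numerically stable softmax over the gaps: triplets whose negative
            # is close to (or above) the positive receive larger weights.
            top = max(gaps)
            exps = [math.exp(beta * (g - top)) for g in gaps]
            norm = sum(exps)
            total += sum(e / norm * h for e, h in zip(exps, hinges))
    return total
```

In this sketch the weight depends only on the gap between a negative pair and the positive pair, so the triplet as a whole is the weighted unit and no separate weight is attached to the positive or the negative sample. With a single negative per anchor the weight is 1 and the two functions coincide.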
Keywords
Loss measurement, Task analysis, Benchmark testing, Image retrieval, Sampling methods, Semantics, Cross-modal matching, metric learning, image retrieval, sample weighting, interclass relativity