Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement
CVPR 2024
Abstract
DETR-like methods have significantly increased detection performance in an
end-to-end manner. The mainstream two-stage frameworks of them perform dense
self-attention and select a fraction of queries for sparse cross-attention,
which is proven effective for improving performance but also introduces a heavy
computational burden and high dependence on stable query selection. This paper
demonstrates that suboptimal two-stage selection strategies result in scale
bias and redundancy due to the mismatch between selected queries and objects in
two-stage initialization. To address these issues, we propose hierarchical
salience filtering refinement, which performs transformer encoding only on
filtered discriminative queries, for a better trade-off between computational
efficiency and precision. The filtering process overcomes scale bias through a
novel scale-independent salience supervision. To compensate for the semantic
misalignment among queries, we introduce elaborate query refinement modules for
stable two-stage initialization. Based on these improvements, the proposed
Salience DETR achieves significant improvements of +4.0% AP
on three challenging task-specific detection datasets, as well as 49.2% AP on
COCO 2017 with fewer FLOPs. The code is available at
https://github.com/xiuqhou/Salience-DETR.
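The core idea of the filtering stage — encoding only the most discriminative queries selected by a predicted salience score — can be illustrated with a minimal sketch. This is an assumption-laden toy example (the function name `salience_filter`, the `keep_ratio` parameter, and the numpy setting are all hypothetical), not the paper's implementation; see the linked repository for the actual method.

```python
import numpy as np

def salience_filter(queries, salience, keep_ratio=0.3):
    """Keep only the most salient fraction of queries.

    queries:  (N, D) array of query features
    salience: (N,) predicted per-query salience scores
    Returns the filtered queries and their original indices, so that
    downstream attention runs only on this reduced set.
    """
    k = max(1, int(len(queries) * keep_ratio))
    idx = np.argsort(-salience)[:k]  # indices of the top-k salient queries
    return queries[idx], idx

# Toy example: 10 queries with 4-dim features; only the top 30% are kept,
# so a subsequent encoder would attend over 3 queries instead of 10.
rng = np.random.default_rng(0)
q = rng.normal(size=(10, 4))
s = rng.uniform(size=10)
filtered, idx = salience_filter(q, s, keep_ratio=0.3)
print(filtered.shape)  # (3, 4)
```

Applying such a filter hierarchically (re-ranking and pruning at successive encoder stages) is what trades dense self-attention cost for sparse computation on discriminative queries.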