Efficient Action Counting with Dynamic Queries
CoRR (2024)
Abstract
Temporal repetition counting aims to quantify the repeated action cycles
within a video. The majority of existing methods rely on the similarity
correlation matrix to characterize the repetitiveness of actions, but their
scalability is hindered due to the quadratic computational complexity. In this
work, we introduce a novel approach that employs an action query representation
to localize repeated action cycles with linear computational complexity. Based
on this representation, we further develop two key components to tackle the
essential challenges of temporal repetition counting. Firstly, to facilitate
open-set action counting, we propose a dynamic update scheme for action
queries. Unlike static action queries, this approach dynamically embeds video
features into action queries, offering a more flexible and generalizable
representation. Secondly, to distinguish between actions of interest and
background noise actions, we incorporate inter-query contrastive learning to
regularize the video representations corresponding to different action queries.
As a result, our method significantly outperforms previous works, particularly
on long video sequences, unseen actions, and actions at various
speeds. On the challenging RepCountA benchmark, we outperform the
state-of-the-art method TransRAC by 26.5% [...]
error decrease and 94.1% [...]
https://github.com/lizishi/DeTRC.
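
The complexity contrast described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`similarity_matrix`, `query_attention`, `dynamic_update`), the additive update rule, and the sizes `T`, `Q`, `D` are all assumptions chosen for illustration. It only shows that an all-pairs frame similarity matrix needs T × T scores, while a fixed set of Q queries attending over the frames needs Q × T scores, which grows linearly with video length.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def similarity_matrix(frames):
    """All-pairs frame similarity: T * T scores, quadratic in video length."""
    return [[dot(a, b) for b in frames] for a in frames]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def query_attention(queries, frames):
    """Each query attends over all frames: Q * T scores, linear in T."""
    scale = math.sqrt(len(frames[0]))
    return [softmax([dot(q, f) / scale for f in frames]) for q in queries]


def dynamic_update(queries, frames):
    """Hypothetical dynamic update: add attention-pooled video features
    to each query (an assumed rule, not taken from the paper)."""
    weights = query_attention(queries, frames)
    dim = len(frames[0])
    updated = []
    for q, w in zip(queries, weights):
        pooled = [sum(w[t] * frames[t][k] for t in range(len(frames)))
                  for k in range(dim)]
        updated.append([qk + pk for qk, pk in zip(q, pooled)])
    return updated


random.seed(0)
T, Q, D = 64, 4, 8  # frames, queries, feature dimension (illustrative sizes)
frames = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
queries = [[random.gauss(0, 1) for _ in range(D)] for _ in range(Q)]

sim = similarity_matrix(frames)
attn = query_attention(queries, frames)
updated = dynamic_update(queries, frames)

# Number of scores computed by each approach
print(len(sim) * len(sim[0]), len(attn) * len(attn[0]))  # prints "4096 256"
```

Doubling `T` quadruples the first count but only doubles the second, which is the scaling behavior the abstract attributes to the query-based representation.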