SpotFormer: A Transformer-based Framework for Precise Soccer Action Spotting

Mengqi Cao, Min Yang, Guozhen Zhang, Xiaotian Li, Yilu Wu, Gangshan Wu, Limin Wang

2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), 2022

Abstract

Action spotting and classification consist of detecting the exact moments at which events occur in long videos. Mainstream spotting approaches generally use a two-stage pipeline: feature collection and integration, followed by salient action detection and postprocessing. Following this practice, we present SpotFormer, a simple yet effective framework capable of precise action spotting. Specifically, we employ several state-of-the-art backbone networks as auxiliary feature extractors and reduce feature dimensionality in a straightforward and efficient way. The frame-wise features are then fed into a transformer-based spotting network designed to leverage spatiotemporal information. With model ensembling, we obtain a tight mAP of 0.609 and achieve state-of-the-art performance on the SoccerNet-v2 dataset.
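
The tight mAP reported above rewards a predicted spot only when it lands close in time to an annotated event. The sketch below illustrates that tolerance-based matching in plain Python. It is not the official SoccerNet evaluation code and not the authors' implementation: the function name `match_spots`, the parameter `max_offset`, and the greedy confidence-ordered matching rule are all assumptions made for illustration.

```python
def match_spots(predictions, ground_truth, max_offset):
    """Label each predicted spot as a hit or a miss (illustrative only).

    predictions:  list of (time_seconds, confidence) tuples
    ground_truth: list of annotated event times in seconds
    max_offset:   hypothetical tolerance; a prediction is a hit if it lies
                  within max_offset seconds of a still-unmatched annotation
    """
    unmatched = list(ground_truth)
    results = []
    # Visit predictions from most to least confident so that the strongest
    # prediction near an event claims it first.
    for time_s, confidence in sorted(predictions, key=lambda p: p[1], reverse=True):
        best = None
        for gt in unmatched:
            distance = abs(time_s - gt)
            if distance <= max_offset and (best is None or distance < abs(time_s - best)):
                best = gt
        if best is not None:
            # Each annotation can be matched by at most one prediction.
            unmatched.remove(best)
        results.append((time_s, confidence, best is not None))
    return results


# Made-up example values, not taken from the paper.
preds = [(10.4, 0.9), (10.8, 0.6), (55.0, 0.8), (120.0, 0.3)]
events = [10.0, 57.0]
tight_hits = [hit for _, _, hit in match_spots(preds, events, max_offset=1.0)]
looser_hits = [hit for _, _, hit in match_spots(preds, events, max_offset=2.5)]
```

With the smaller tolerance only the most confident prediction near the first event counts as a hit; widening the tolerance also admits the prediction two seconds away from the second event. A precision-recall curve, and from it an average precision, could then be computed from the confidence-ordered hit labels.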

Keywords

video understanding, action spotting, action recognition
