SpotFormer: A Transformer-based Framework for Precise Soccer Action Spotting

2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), 2022

Abstract
Action spotting and classification consist of detecting the exact moments at which events occur in long videos. Current mainstream spotting approaches generally use a two-stage pipeline: feature collection and integration, followed by salient action detection and postprocessing. Following this practice, we present SpotFormer, a simple yet effective framework capable of precise action spotting. Specifically, we employ several state-of-the-art backbone networks as auxiliary feature extractors and reduce feature dimensionality in a straightforward and efficient way. The frame-wise features are fed into a transformer-based spotting network designed to leverage spatiotemporal information. With model ensembling, we obtain a tight mAP of 0.609, achieving state-of-the-art performance on the SoccerNet-v2 dataset.
Keywords
video understanding,action spotting,action recognition
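
The abstract mentions a postprocessing step after salient action detection but does not say how it is done. As a rough illustration only, the sketch below shows one common choice for spotting pipelines, greedy temporal non-maximum suppression over per-frame confidence scores. The function name `spot_actions`, the `threshold` and `window` parameters, and the example scores are all hypothetical and are not taken from SpotFormer.

```python
def spot_actions(scores, threshold=0.5, window=5):
    """Greedy temporal NMS over per-frame scores for a single action class.

    Illustrative assumption, not the paper's method.
    scores: list of floats, one confidence value per frame.
    Returns a list of (frame_index, score) tuples sorted by frame index.
    """
    # Collect frames whose score passes the threshold, strongest first
    # (ties broken by the earlier frame).
    candidates = sorted(
        ((s, i) for i, s in enumerate(scores) if s >= threshold),
        key=lambda c: (-c[0], c[1]),
    )
    kept = []
    for score, idx in candidates:
        # Keep a candidate only if every already-kept spot is more than
        # `window` frames away from it.
        if all(abs(idx - k) > window for k, _ in kept):
            kept.append((idx, score))
    return sorted(kept)


# Hypothetical per-frame scores with two nearby peaks around frames 2 and 9.
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.1, 0.1, 0.1, 0.7, 0.6, 0.2]
spots = spot_actions(scores, threshold=0.5, window=3)
print(spots)
```

A tight evaluation metric rewards spots that fall within a small temporal tolerance of the ground truth, so suppressing near-duplicate detections around each peak is one plausible way to avoid redundant predictions.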