Domain-Guided Masked Autoencoders for Unique Player Identification
arXiv (2024)
Abstract
Unique player identification is a fundamental module in vision-driven sports
analytics. Identifying players from broadcast videos can aid with various
downstream tasks such as player assessment, in-game analysis, and broadcast
production. However, automatic detection of jersey numbers using deep features
is challenging primarily due to a) motion blur, b) low-resolution video feeds,
and c) occlusions. With their recent success in various vision tasks, masked
autoencoders (MAEs) have emerged as a superior alternative to conventional
feature extractors. However, most MAEs either zero out image patches at random
or focus on where to mask rather than how to mask. Motivated by human
vision, we devise a novel domain-guided masking policy for MAEs termed d-MAE to
facilitate robust feature extraction in the presence of motion blur for player
identification. We further introduce a new spatio-temporal network leveraging
our novel d-MAE for unique player identification. We conduct experiments on
three large-scale sports datasets, including a curated baseball dataset, the
SoccerNet dataset, and an in-house ice hockey dataset. We preprocess the
datasets using an upgraded keyframe identification (KfID) module by focusing on
frames containing jersey numbers. Additionally, we propose a keyframe-fusion
technique to augment keyframes, preserving spatial and temporal context. Our
spatio-temporal network showcases significant improvements, surpassing the
current state-of-the-art by 8.58
respectively. Rigorous ablations highlight the effectiveness of our
domain-guided masking approach and the refined KfID module, resulting in
performance enhancements of 1.48
architectures.
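
To make the contrast between "where to mask" and "how to mask" concrete, the following is a minimal sketch, not the d-MAE policy itself, which the abstract does not specify. It assumes, purely for illustration, that a domain-guided mask fills the selected patches with a motion-blurred copy of the image instead of zeros. The function names, the horizontal box kernel, the patch size, and the mask ratio are all hypothetical choices.

```python
import numpy as np


def horizontal_motion_blur(image, kernel_len=5):
    """Blur each row with a 1-D box kernel (a crude stand-in for motion blur).

    Assumption: kernel_len is odd, so the output keeps the input width.
    """
    kernel = np.ones(kernel_len) / kernel_len
    pad = kernel_len // 2
    # Edge-pad the columns so a 'valid' convolution returns the original width.
    padded = np.pad(image, ((0, 0), (pad, pad)), mode="edge")
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, padded
    )


def mask_patches(image, patch=4, ratio=0.5, policy="blur", seed=0):
    """Corrupt a random subset of non-overlapping patches of a 2-D image.

    policy="zero": conventional MAE-style masking (patches set to 0).
    policy="blur": hypothetical domain-guided variant (patches replaced
                   by their motion-blurred version).
    Returns the corrupted image and the boolean patch-level mask.
    """
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image size must be a multiple of the patch size")
    rows, cols = h // patch, w // patch
    n_patches = rows * cols
    n_masked = int(round(ratio * n_patches))

    # Choose which patches to corrupt ("where to mask").
    rng = np.random.default_rng(seed)
    chosen = rng.choice(n_patches, size=n_masked, replace=False)
    mask = np.zeros(n_patches, dtype=bool)
    mask[chosen] = True
    mask = mask.reshape(rows, cols)

    # Expand the patch-level mask to pixel resolution.
    pixel_mask = np.repeat(np.repeat(mask, patch, axis=0), patch, axis=1)

    # Choose the fill content ("how to mask").
    if policy == "blur":
        fill = horizontal_motion_blur(image)
    else:
        fill = np.zeros_like(image)
    return np.where(pixel_mask, fill, image), mask


if __name__ == "__main__":
    img = np.random.default_rng(1).random((8, 8))
    blurred, m = mask_patches(img, policy="blur")
    zeroed, _ = mask_patches(img, policy="zero")
    print("masked patches:", int(m.sum()), "of", m.size)
```

Under this assumption, an encoder trained on the blurred variant would see degraded but non-empty content in masked regions, which is one plausible reading of masking guided by the motion-blur domain; the actual d-MAE design may differ.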