Semi-Supervised Crowd Counting with Contextual Modeling: Facilitating Holistic Understanding of Crowd Scenes
arXiv (2023)
Abstract
To ease the heavy annotation burden of training a reliable crowd
counting model, and to make the model more practical and accurate by letting
it benefit from more data, this paper presents a new semi-supervised
method based on the mean teacher framework. When labeled data is scarce,
the model is prone to overfitting to local patches. In that setting,
the conventional approach of using unlabeled data solely to improve the
accuracy of local patch predictions proves inadequate. We therefore
propose a more nuanced approach: fostering the model's intrinsic 'subitizing'
capability. This ability allows the model to accurately estimate the count in
regions by leveraging its understanding of the crowd scenes, mirroring the
human cognitive process. To achieve this goal, we apply masking on unlabeled
data, guiding the model to make predictions for these masked patches based on
holistic cues. Furthermore, to help with feature learning, we
incorporate a fine-grained density classification task. Our method is general
and applicable to most existing crowd counting methods, as it imposes no
strict structural or loss constraints. In addition, we observe that the model
trained with our framework exhibits a 'subitizing'-like behavior. It accurately
predicts low-density regions with only a 'glance', while incorporating local
details to predict high-density regions. Our method achieves
state-of-the-art performance, surpassing previous approaches by a large margin
on challenging benchmarks such as ShanghaiTech A and UCF-QNRF. The code is
available at: https://github.com/cha15yq/MRC-Crowd.
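
The ingredients named in the abstract (a mean teacher, masking of unlabeled inputs, and a density classification target) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the EMA decay, the random patch-masking scheme, the squared-error consistency term, and the bin edges are all assumptions chosen for illustration, and the density map is assumed to have the same resolution as the image.

```python
import random


def ema_update(teacher, student, decay=0.99):
    """Mean-teacher update: teacher <- decay * teacher + (1 - decay) * student."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def random_patch_mask(grid_h, grid_w, mask_ratio, rng):
    """Return a grid of booleans; True marks a patch hidden from the student."""
    n = grid_h * grid_w
    n_masked = int(round(mask_ratio * n))
    masked = set(rng.sample(range(n), n_masked))
    return [[(r * grid_w + c) in masked for c in range(grid_w)]
            for r in range(grid_h)]


def apply_mask(image, mask, patch):
    """Zero out every pixel that falls inside a masked patch."""
    return [[0.0 if mask[y // patch][x // patch] else v
             for x, v in enumerate(row)]
            for y, row in enumerate(image)]


def masked_consistency_loss(student_density, teacher_density, mask, patch):
    """Mean squared error between student and teacher on masked patches only.

    The teacher sees the unmasked image; the student must fill in the masked
    patches from the surrounding context (assumed loss, for illustration).
    """
    total, count = 0.0, 0
    for y, row in enumerate(student_density):
        for x, s in enumerate(row):
            if mask[y // patch][x // patch]:
                total += (s - teacher_density[y][x]) ** 2
                count += 1
    return total / count if count else 0.0


def density_class(patch_count, bin_edges):
    """Map a patch count to a class index given ascending bin edges (hypothetical bins)."""
    return sum(patch_count > edge for edge in bin_edges)


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[1.0] * 8 for _ in range(8)]            # toy 8x8 unlabeled image
    mask = random_patch_mask(4, 4, 0.5, rng)         # 4x4 grid of 2x2 patches
    student_input = apply_mask(image, mask, patch=2)
    teacher_pred = [[1.0] * 8 for _ in range(8)]     # stand-in teacher output
    student_pred = [[0.0] * 8 for _ in range(8)]     # stand-in student output
    print(masked_consistency_loss(student_pred, teacher_pred, mask, patch=2))
    print(density_class(3, [0, 5, 20]))
```

In a real training loop the two stand-in predictions would come from the teacher and student networks, and `ema_update` would be applied to the network weights after each student optimization step.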